Mathematical Analysis 
Volume II 


Teo Lee Peng 




January 1, 2024 


Contents

Preface

Chapter 1 Euclidean Spaces
1.1 The Euclidean Space $\mathbb{R}^n$ as a Vector Space
1.2 Convergence of Sequences in $\mathbb{R}^n$
1.3 Open Sets and Closed Sets
1.4 Interior, Exterior, Boundary and Closure
1.5 Limit Points and Isolated Points

Chapter 2 Limits of Multivariable Functions and Continuity
2.1 Multivariable Functions
2.1.1 Polynomials and Rational Functions
2.1.2 Component Functions of a Mapping
2.1.3 Invertible Mappings
2.1.4 Linear Transformations
2.1.5 Quadratic Forms
2.2 Limits of Functions
2.3 Continuity
2.4 Uniform Continuity
2.5 Contraction Mapping Theorem

Chapter 3 Continuous Functions on Connected Sets and Compact Sets
3.1 Path-Connectedness and Intermediate Value Theorem
3.2 Connectedness and Intermediate Value Property
3.3 Sequential Compactness and Compactness
3.4 Applications of Compactness
3.4.1 The Extreme Value Theorem
3.4.2 Distance Between Sets
3.4.3 Uniform Continuity
3.4.4 Linear Transformations and Quadratic Forms
3.4.5 Lebesgue Number Lemma

Chapter 4 Differentiating Functions of Several Variables
4.1 Partial Derivatives
4.2 Differentiability and First Order Approximation
4.2.1 Differentiability
4.2.2 First Order Approximations
4.2.3 Tangent Planes
4.2.4 Directional Derivatives
4.3 The Chain Rule and the Mean Value Theorem
4.4 Second Order Approximations
4.5 Local Extrema

Chapter 5 The Inverse and Implicit Function Theorems
5.1 The Inverse Function Theorem
5.2 The Proof of the Inverse Function Theorem
5.3 The Implicit Function Theorem
5.4 Extrema Problems and the Method of Lagrange Multipliers

Chapter 6 Multiple Integrals
6.1 Riemann Integrals
6.2 Properties of Riemann Integrals
6.3 Jordan Measurable Sets and Riemann Integrable Functions
6.4 Iterated Integrals and Fubini's Theorem
6.5 Change of Variables Theorem
6.5.1 Translations and Linear Transformations
6.5.2 Polar Coordinates
6.5.3 Spherical Coordinates
6.5.4 Other Examples
6.6 Proof of the Change of Variables Theorem
6.7 Some Important Integrals and Their Applications

Chapter 7 Fourier Series and Fourier Transforms
7.1 Orthogonal Systems of Functions and Fourier Series
7.2 The Pointwise Convergence of a Fourier Series
7.3 The $L^2$ Convergence of a Fourier Series
7.4 The Uniform Convergence of a Trigonometric Series
7.5 Fourier Transforms

Appendix A Sylvester's Criterion
Appendix B Volumes of Parallelepipeds
Appendix C Riemann Integrability

References


Preface 


Mathematical analysis is a standard course which introduces students to rigorous
reasoning in mathematics, as well as the theories needed for advanced analysis
courses. It is a compulsory course for all mathematics majors. It is also strongly
recommended for students who major in computer science, physics, data science,
financial analysis, and other areas that require strong analytical skills. Some
standard textbooks in mathematical analysis include the classical ones by Apostol
[Apo74] and Rudin [Rud76], and the modern ones by Bartle [BS92], Fitzpatrick
[Fit09], Abbott [Abb15], Tao [Tao16, Tao14] and Zorich [Zor15, Zor16].

This book is the second volume of a textbook intended for a one-year course
in mathematical analysis. We introduce the fundamental concepts in a pedagogical
way. Many examples are given to illustrate the theories. We assume that students
are familiar with the material of calculus such as that in the book [SCW20].
Thus, we do not emphasize computational techniques. Emphasis is put on
building up analytical skills through rigorous reasoning.

Besides calculus, it is also assumed that students have taken introductory
courses in discrete mathematics and linear algebra, which cover topics such as
logic, sets, functions, vector spaces, inner products, and quadratic forms. Whenever
needed, these concepts will be briefly reviewed.

In this book, we have carefully defined all the mathematical terms we use.
While most of the terms have standard definitions, some of the terms may have
definitions that differ from author to author. The readers are advised to check the
definitions of the terms used in this book when they encounter them. This can be
easily done by using the search function provided by any PDF viewer. The readers
are also encouraged to fully utilize the hyper-referencing provided.


Teo Lee Peng 


Chapter 1. Euclidean Spaces 1 


Chapter 1 
Euclidean Spaces 


In this second volume of mathematical analysis, we study functions defined on
subsets of $\mathbb{R}^n$. For this, we need to study the structure and topology of $\mathbb{R}^n$ first.
We start with a review of $\mathbb{R}^n$ as a vector space.

In the sequel, $n$ is a fixed positive integer reserved to be used for $\mathbb{R}^n$.


1.1 The Euclidean Space $\mathbb{R}^n$ as a Vector Space

If $S_1, S_2, \ldots, S_n$ are sets, the cartesian product of these $n$ sets is defined as the set

$$S = S_1 \times \cdots \times S_n = \prod_{i=1}^{n} S_i = \{(a_1, \ldots, a_n) \mid a_i \in S_i,\ 1 \le i \le n\}$$

that contains all $n$-tuples $(a_1, \ldots, a_n)$, where $a_i \in S_i$ for all $1 \le i \le n$.

The set $\mathbb{R}^n$ is the cartesian product of $n$ copies of $\mathbb{R}$. Namely,

$$\mathbb{R}^n = \{(x_1, x_2, \ldots, x_n) \mid x_1, x_2, \ldots, x_n \in \mathbb{R}\}.$$

The point $(x_1, x_2, \ldots, x_n)$ is denoted as $\mathbf{x}$, whereas $x_1, x_2, \ldots, x_n$ are called the
components of the point $\mathbf{x}$. We can define an addition and a scalar multiplication
on $\mathbb{R}^n$. If $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ and $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ are in $\mathbb{R}^n$, the addition of
$\mathbf{x}$ and $\mathbf{y}$ is defined as

$$\mathbf{x} + \mathbf{y} = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n).$$

In other words, it is a componentwise addition. Given a real number $\alpha$, the scalar
multiplication of $\alpha$ with $\mathbf{x}$ is given by the componentwise multiplication

$$\alpha\mathbf{x} = (\alpha x_1, \alpha x_2, \ldots, \alpha x_n).$$

The set $\mathbb{R}^n$ with the addition and scalar multiplication operations is a vector
space. It satisfies the 10 axioms for a real vector space $V$.


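The 10 Axioms for a Real Vector Space $V$

Let $V$ be a set that is equipped with two operations, the addition and the
scalar multiplication. For any two vectors $\mathbf{u}$ and $\mathbf{v}$ in $V$, their addition is
denoted by $\mathbf{u} + \mathbf{v}$. For a vector $\mathbf{v}$ in $V$ and a scalar $\alpha \in \mathbb{R}$, the scalar
multiplication of $\mathbf{v}$ by $\alpha$ is denoted by $\alpha\mathbf{v}$. We say that $V$ with the addition
and scalar multiplication is a real vector space provided that the following
10 axioms are satisfied for any $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$ in $V$, and any $\alpha$ and $\beta$ in $\mathbb{R}$.

Axiom 1 If $\mathbf{u}$ and $\mathbf{v}$ are in $V$, then $\mathbf{u} + \mathbf{v}$ is in $V$.

Axiom 2 $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$.

Axiom 3 $(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w})$.

Axiom 4 There is a zero vector $\mathbf{0}$ in $V$ such that

$$\mathbf{0} + \mathbf{v} = \mathbf{v} = \mathbf{v} + \mathbf{0} \quad \text{for all } \mathbf{v} \in V.$$

Axiom 5 For any $\mathbf{v}$ in $V$, there is a vector $\mathbf{w}$ in $V$ such that

$$\mathbf{v} + \mathbf{w} = \mathbf{0} = \mathbf{w} + \mathbf{v}.$$

The vector $\mathbf{w}$ satisfying this equation is called the negative of $\mathbf{v}$, and is
denoted by $-\mathbf{v}$.

Axiom 6 For any $\mathbf{v}$ in $V$, and any $\alpha \in \mathbb{R}$, $\alpha\mathbf{v}$ is in $V$.

Axiom 7 $\alpha(\mathbf{u} + \mathbf{v}) = \alpha\mathbf{u} + \alpha\mathbf{v}$.

Axiom 8 $(\alpha + \beta)\mathbf{v} = \alpha\mathbf{v} + \beta\mathbf{v}$.

Axiom 9 $\alpha(\beta\mathbf{v}) = (\alpha\beta)\mathbf{v}$.

Axiom 10 $1\mathbf{v} = \mathbf{v}$.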


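$\mathbb{R}^n$ is a real vector space. The zero vector is the point $\mathbf{0} = (0, 0, \ldots, 0)$ with
all components equal to 0. Sometimes we also call a point $\mathbf{x} = (x_1, \ldots, x_n)$ in
$\mathbb{R}^n$ a vector, and identify it with the vector from the origin $O$ to the point $\mathbf{x}$.

Definition 1.1 Standard Unit Vectors

In $\mathbb{R}^n$, there are $n$ standard unit vectors $\mathbf{e}_1, \ldots, \mathbf{e}_n$ given by

$$\mathbf{e}_1 = (1, 0, \ldots, 0), \quad \mathbf{e}_2 = (0, 1, \ldots, 0), \quad \ldots, \quad \mathbf{e}_n = (0, 0, \ldots, 1).$$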


Let us review some concepts from linear algebra which will be useful later.
Given that $\mathbf{v}_1, \ldots, \mathbf{v}_k$ are vectors in a vector space $V$, a linear combination of
$\mathbf{v}_1, \ldots, \mathbf{v}_k$ is a vector $\mathbf{v}$ in $V$ of the form

$$\mathbf{v} = c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k$$

for some scalars $c_1, \ldots, c_k$, which are known as the coefficients of the linear
combination.

A subspace of a vector space V is a subset of V that is itself a vector space. 
There is a simple way to construct subspaces. 


Proposition 1.1 


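Let $V$ be a vector space, and let $\mathbf{v}_1, \ldots, \mathbf{v}_k$ be vectors in $V$. The subset

$$W = \{c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k \mid c_1, \ldots, c_k \in \mathbb{R}\}$$

of $V$ that contains all linear combinations of $\mathbf{v}_1, \ldots, \mathbf{v}_k$ is itself a vector
space. It is called the subspace of $V$ spanned by $\mathbf{v}_1, \ldots, \mathbf{v}_k$.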


Example 1.1 


In $\mathbb{R}^3$, the subspace spanned by the vectors $\mathbf{e}_1 = (1, 0, 0)$ and $\mathbf{e}_3 = (0, 0, 1)$
is the set $W$ that contains all points of the form

$$x(1, 0, 0) + z(0, 0, 1) = (x, 0, z),$$

which is the $xz$-plane.


Next, we recall the concept of linear independence. 


Definition 1.2 Linear Independence

Let $V$ be a vector space, and let $\mathbf{v}_1, \ldots, \mathbf{v}_k$ be vectors in $V$. We say that
the set $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ is a linearly independent set of vectors, or the vectors
$\mathbf{v}_1, \ldots, \mathbf{v}_k$ are linearly independent, if the only $k$-tuple of real numbers
$(c_1, \ldots, c_k)$ which satisfies

$$c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k = \mathbf{0}$$

is the trivial $k$-tuple $(c_1, \ldots, c_k) = (0, \ldots, 0)$.


Example 1.2 


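In $\mathbb{R}^n$, the standard unit vectors $\mathbf{e}_1, \ldots, \mathbf{e}_n$ are linearly independent.

Example 1.3

If $V$ is a vector space, a vector $\mathbf{v}$ in $V$ is linearly independent if and only if
$\mathbf{v} \ne \mathbf{0}$.

Example 1.4

Let $V$ be a vector space. Two vectors $\mathbf{u}$ and $\mathbf{v}$ in $V$ are linearly independent
if and only if $\mathbf{u} \ne \mathbf{0}$, $\mathbf{v} \ne \mathbf{0}$, and there does not exist a constant $\alpha$ such that
$\mathbf{v} = \alpha\mathbf{u}$.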


Let us recall the following definition for two vectors to be parallel. 


Definition 1.3 Parallel Vectors 


Let $V$ be a vector space. Two vectors $\mathbf{u}$ and $\mathbf{v}$ in $V$ are parallel if either
$\mathbf{u} = \mathbf{0}$ or there exists a constant $\alpha$ such that $\mathbf{v} = \alpha\mathbf{u}$.


In other words, two vectors u and v in V are linearly independent if and only 
if they are not parallel. 


Example 1.5

If $S = \{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ is a linearly independent set of vectors, then for any
$S' \subset S$, $S'$ is also a linearly independent set of vectors.


Now we discuss the concept of dimension and basis. 


Definition 1.4 Dimension and Basis 


Let $V$ be a vector space, and let $W$ be a subspace of $V$. If $W$ can be
spanned by $k$ linearly independent vectors $\mathbf{v}_1, \ldots, \mathbf{v}_k$ in $V$, we say that $W$
has dimension $k$. The set $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ is called a basis of $W$.


Example 1.6 


In $\mathbb{R}^n$, the $n$ standard unit vectors $\mathbf{e}_1, \ldots, \mathbf{e}_n$ are linearly independent and
they span $\mathbb{R}^n$. Hence, the dimension of $\mathbb{R}^n$ is $n$.

Example 1.7

In $\mathbb{R}^3$, the subspace spanned by the two linearly independent vectors $\mathbf{e}_1 =
(1, 0, 0)$ and $\mathbf{e}_3 = (0, 0, 1)$ has dimension 2.


Next, we introduce the translate of a set. 


Definition 1.5 Translate of a Set 


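If $A$ is a subset of $\mathbb{R}^n$ and $\mathbf{u}$ is a point in $\mathbb{R}^n$, the translate of the set $A$ by the
vector $\mathbf{u}$ is the set

$$A + \mathbf{u} = \{\mathbf{a} + \mathbf{u} \mid \mathbf{a} \in A\}.$$

Example 1.8

In $\mathbb{R}^3$, the translate of the set $A = \{(x, y, 0) \mid x, y \in \mathbb{R}\}$ by the vector
$\mathbf{u} = (0, 0, -2)$ is the set

$$B = A + \mathbf{u} = \{(x, y, -2) \mid x, y \in \mathbb{R}\}.$$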


In $\mathbb{R}^n$, the lines and the planes are of particular interest. They are closely
related to the concept of subspaces.


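Definition 1.6 Lines in $\mathbb{R}^n$

A line $L$ in $\mathbb{R}^n$ is a translate of a subspace of $\mathbb{R}^n$ that has dimension 1. As
a set, it contains all the points $\mathbf{x}$ of the form

$$\mathbf{x} = \mathbf{x}_0 + t\mathbf{v}, \quad t \in \mathbb{R},$$

where $\mathbf{x}_0$ is a fixed point in $\mathbb{R}^n$, and $\mathbf{v}$ is a nonzero vector in $\mathbb{R}^n$. The
equation $\mathbf{x} = \mathbf{x}_0 + t\mathbf{v}$, $t \in \mathbb{R}$, is known as the parametric equation of the
line.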


A line is determined by two points. 


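Example 1.9

Given two distinct points $\mathbf{x}_1$ and $\mathbf{x}_2$ in $\mathbb{R}^n$, the line $L$ that passes through
these two points has parametric equation given by

$$\mathbf{x} = \mathbf{x}_1 + t(\mathbf{x}_2 - \mathbf{x}_1), \quad t \in \mathbb{R}.$$

When $0 \le t \le 1$, $\mathbf{x} = \mathbf{x}_1 + t(\mathbf{x}_2 - \mathbf{x}_1)$ describes all the points on the line
segment with $\mathbf{x}_1$ and $\mathbf{x}_2$ as endpoints.

Figure 1.1: A line between two points.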
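The parametric equation of Example 1.9 can be evaluated directly. The following is a minimal sketch, not part of the original text; the function name `point_on_line` and the sample points are chosen here purely for illustration.

```python
def point_on_line(x1, x2, t):
    """Return the point x1 + t*(x2 - x1), computed componentwise."""
    return tuple(a + t * (b - a) for a, b in zip(x1, x2))


# Two illustrative points in R^3 (hypothetical sample data).
x1 = (1.0, 0.0, 2.0)
x2 = (3.0, 4.0, 0.0)

# t = 0 gives x1, t = 1 gives x2, 0 < t < 1 lies on the segment between them,
# and values of t outside [0, 1] lie on the line but outside the segment.
for t in (0.0, 0.5, 1.0, 2.0):
    print(t, point_on_line(x1, x2, t))
```

Restricting $t$ to $[0, 1]$ yields exactly the line segment described in the example.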


Definition 1.7 Planes in $\mathbb{R}^n$

A plane $W$ in $\mathbb{R}^n$ is a translate of a subspace of dimension 2. As a set, it
contains all the points $\mathbf{x}$ of the form

$$\mathbf{x} = \mathbf{x}_0 + t_1\mathbf{v}_1 + t_2\mathbf{v}_2, \quad t_1, t_2 \in \mathbb{R},$$

where $\mathbf{x}_0$ is a fixed point in $\mathbb{R}^n$, and $\mathbf{v}_1$ and $\mathbf{v}_2$ are two linearly independent
vectors in $\mathbb{R}^n$.


Besides being a real vector space, $\mathbb{R}^n$ has an additional structure. Its definition
is motivated as follows. Let $P(x_1, x_2, x_3)$ and $Q(y_1, y_2, y_3)$ be two points in $\mathbb{R}^3$.
By Pythagoras theorem, the distance between $P$ and $Q$ is given by

$$PQ = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2}.$$

Figure 1.2: Distance between two points in $\mathbb{R}^2$.

Consider the triangle $OPQ$ with vertices $O$, $P$, $Q$, where $O$ is the origin. Then

$$OP = \sqrt{x_1^2 + x_2^2 + x_3^2}, \qquad OQ = \sqrt{y_1^2 + y_2^2 + y_3^2}.$$

Let $\theta$ be the minor angle between $OP$ and $OQ$. By cosine rule,

$$PQ^2 = OP^2 + OQ^2 - 2 \times OP \times OQ \times \cos\theta.$$

A straightforward computation gives

$$OP^2 + OQ^2 - PQ^2 = 2(x_1y_1 + x_2y_2 + x_3y_3).$$

Figure 1.3: Cosine rule.

Hence,

$$\cos\theta = \frac{x_1y_1 + x_2y_2 + x_3y_3}{\sqrt{x_1^2 + x_2^2 + x_3^2}\,\sqrt{y_1^2 + y_2^2 + y_3^2}}. \qquad (1.1)$$

It is a quotient of $x_1y_1 + x_2y_2 + x_3y_3$ by the product of the lengths of $OP$ and $OQ$.
Generalizing the expression $x_1y_1 + x_2y_2 + x_3y_3$ from $\mathbb{R}^3$ to $\mathbb{R}^n$ defines the dot
product. For any two vectors $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ and $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ in $\mathbb{R}^n$,
the dot product of $\mathbf{x}$ and $\mathbf{y}$ is defined as

$$\mathbf{x} \cdot \mathbf{y} = \sum_{i=1}^{n} x_iy_i = x_1y_1 + x_2y_2 + \cdots + x_ny_n.$$

This is a special case of an inner product.


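Definition 1.8 Inner Product Space

A real vector space $V$ is an inner product space if for any two vectors $\mathbf{u}$ and
$\mathbf{v}$ in $V$, an inner product $\langle \mathbf{u}, \mathbf{v} \rangle$ of $\mathbf{u}$ and $\mathbf{v}$ is defined, and the following
conditions for any $\mathbf{u}, \mathbf{v}, \mathbf{w}$ in $V$ and $\alpha, \beta \in \mathbb{R}$ are satisfied.

1. $\langle \mathbf{u}, \mathbf{v} \rangle = \langle \mathbf{v}, \mathbf{u} \rangle$.

2. $\langle \alpha\mathbf{u} + \beta\mathbf{v}, \mathbf{w} \rangle = \alpha\langle \mathbf{u}, \mathbf{w} \rangle + \beta\langle \mathbf{v}, \mathbf{w} \rangle$.

3. $\langle \mathbf{v}, \mathbf{v} \rangle \ge 0$ and $\langle \mathbf{v}, \mathbf{v} \rangle = 0$ if and only if $\mathbf{v} = \mathbf{0}$.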


Proposition 1.2 Euclidean Inner Product on $\mathbb{R}^n$

On $\mathbb{R}^n$,

$$\langle \mathbf{x}, \mathbf{y} \rangle = \mathbf{x} \cdot \mathbf{y} = \sum_{i=1}^{n} x_iy_i = x_1y_1 + x_2y_2 + \cdots + x_ny_n$$

defines an inner product, called the standard inner product or the Euclidean
inner product.


Definition 1.9 Euclidean Space

The vector space $\mathbb{R}^n$ with the Euclidean inner product is called the
Euclidean $n$-space.

In the future, when we do not specify, $\mathbb{R}^n$ always means the Euclidean $n$-space.

One can deduce some useful identities from the three axioms of an inner
product space.


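Proposition 1.3

If $V$ is an inner product space, then the following holds.

(a) For any $\mathbf{v} \in V$, $\langle \mathbf{0}, \mathbf{v} \rangle = 0 = \langle \mathbf{v}, \mathbf{0} \rangle$.

(b) For any vectors $\mathbf{v}_1, \ldots, \mathbf{v}_k, \mathbf{w}_1, \ldots, \mathbf{w}_l$ in $V$, and for any real numbers
$\alpha_1, \ldots, \alpha_k, \beta_1, \ldots, \beta_l$,

$$\left\langle \sum_{i=1}^{k} \alpha_i\mathbf{v}_i, \sum_{j=1}^{l} \beta_j\mathbf{w}_j \right\rangle = \sum_{i=1}^{k} \sum_{j=1}^{l} \alpha_i\beta_j \langle \mathbf{v}_i, \mathbf{w}_j \rangle.$$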


Given that $V$ is an inner product space, $\langle \mathbf{v}, \mathbf{v} \rangle \ge 0$ for any $\mathbf{v}$ in $V$. For
example, for any $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ in $\mathbb{R}^n$, under the Euclidean inner product,

$$\langle \mathbf{x}, \mathbf{x} \rangle = \sum_{i=1}^{n} x_i^2 = x_1^2 + x_2^2 + \cdots + x_n^2 \ge 0.$$

When $n = 3$, the length of the vector $OP$ from the point $O(0, 0, 0)$ to the point
$P(x_1, x_2, x_3)$ is

$$OP = \sqrt{x_1^2 + x_2^2 + x_3^2} = \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle}, \quad \text{where } \mathbf{x} = (x_1, x_2, x_3).$$

This motivates us to define the norm of a vector in an inner product space as
follows.


Definition 1.10 Norm of a Vector

Given that $V$ is an inner product space, the norm of a vector $\mathbf{v}$ is defined as

$$\|\mathbf{v}\| = \sqrt{\langle \mathbf{v}, \mathbf{v} \rangle}.$$

The norm of a vector in an inner product space satisfies some properties, which
follow from the axioms for an inner product space.

Proposition 1.4

Let $V$ be an inner product space.

1. For any $\mathbf{v}$ in $V$, $\|\mathbf{v}\| \ge 0$ and $\|\mathbf{v}\| = 0$ if and only if $\mathbf{v} = \mathbf{0}$.

2. For any $\alpha \in \mathbb{R}$ and $\mathbf{v} \in V$, $\|\alpha\mathbf{v}\| = |\alpha|\,\|\mathbf{v}\|$.


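Motivated by the distance between two points in $\mathbb{R}^3$, we make the following
definition.

Definition 1.11 Distance Between Two Points

Given that $V$ is an inner product space, the distance between $\mathbf{u}$ and $\mathbf{v}$ in $V$
is defined as

$$d(\mathbf{u}, \mathbf{v}) = \|\mathbf{v} - \mathbf{u}\| = \sqrt{\langle \mathbf{v} - \mathbf{u}, \mathbf{v} - \mathbf{u} \rangle}.$$

For example, the distance between the points $\mathbf{x} = (x_1, \ldots, x_n)$ and $\mathbf{y} =
(y_1, \ldots, y_n)$ in the Euclidean space $\mathbb{R}^n$ is

$$d(\mathbf{x}, \mathbf{y}) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} = \sqrt{(x_1 - y_1)^2 + \cdots + (x_n - y_n)^2}.$$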


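For analysis in $\mathbb{R}$, an important inequality is the triangle inequality which says
that $|x + y| \le |x| + |y|$ for any $x$ and $y$ in $\mathbb{R}$. To generalize this inequality to $\mathbb{R}^n$,
we need the celebrated Cauchy-Schwarz inequality. It holds on any inner product
space.

Proposition 1.5 Cauchy-Schwarz Inequality

Given that $V$ is an inner product space, for any $\mathbf{u}$ and $\mathbf{v}$ in $V$,

$$|\langle \mathbf{u}, \mathbf{v} \rangle| \le \|\mathbf{u}\|\,\|\mathbf{v}\|.$$

The equality holds if and only if $\mathbf{u}$ and $\mathbf{v}$ are parallel.

It is obvious that if either $\mathbf{u} = \mathbf{0}$ or $\mathbf{v} = \mathbf{0}$,

$$|\langle \mathbf{u}, \mathbf{v} \rangle| = 0 = \|\mathbf{u}\|\,\|\mathbf{v}\|,$$

and so the equality holds.
Now assume that both $\mathbf{u}$ and $\mathbf{v}$ are nonzero vectors. Consider the quadratic
function $f : \mathbb{R} \to \mathbb{R}$ defined by

$$f(t) = \|t\mathbf{u} - \mathbf{v}\|^2 = \langle t\mathbf{u} - \mathbf{v}, t\mathbf{u} - \mathbf{v} \rangle.$$

Notice that $f(t) = at^2 + bt + c$, where

$$a = \langle \mathbf{u}, \mathbf{u} \rangle = \|\mathbf{u}\|^2, \quad b = -2\langle \mathbf{u}, \mathbf{v} \rangle, \quad c = \langle \mathbf{v}, \mathbf{v} \rangle = \|\mathbf{v}\|^2.$$

The third axiom of an inner product says that $f(t) \ge 0$ for all $t \in \mathbb{R}$. Hence,
we must have $b^2 - 4ac \le 0$. This gives

$$\langle \mathbf{u}, \mathbf{v} \rangle^2 \le \|\mathbf{u}\|^2\|\mathbf{v}\|^2.$$

Thus, we obtain the Cauchy-Schwarz inequality

$$|\langle \mathbf{u}, \mathbf{v} \rangle| \le \|\mathbf{u}\|\,\|\mathbf{v}\|.$$

The equality holds if and only if $b^2 - 4ac = 0$. The latter means that
$f(t) = 0$ for some $t = \alpha$, which can happen if and only if

$$\alpha\mathbf{u} - \mathbf{v} = \mathbf{0},$$

or equivalently, $\mathbf{v} = \alpha\mathbf{u}$.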
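As a numerical sanity check of the Cauchy-Schwarz inequality for the Euclidean inner product, one might compare $\|\mathbf{u}\|\,\|\mathbf{v}\|$ with $|\langle \mathbf{u}, \mathbf{v} \rangle|$ on a few vectors. The sketch below is illustrative only; the helper names `dot`, `norm` and `cs_gap` and the sample vectors are assumptions made here, and floating-point rounding means the equality case is only approximate.

```python
import math
import random


def dot(u, v):
    """Euclidean inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def norm(v):
    """Euclidean norm, the square root of <v, v>."""
    return math.sqrt(dot(v, v))


def cs_gap(u, v):
    """Return ||u|| ||v|| - |<u, v>|, which Cauchy-Schwarz says is >= 0."""
    return norm(u) * norm(v) - abs(dot(u, v))


random.seed(0)
for _ in range(5):
    u = [random.uniform(-1, 1) for _ in range(4)]
    v = [random.uniform(-1, 1) for _ in range(4)]
    assert cs_gap(u, v) >= -1e-12  # tolerance for rounding error

# For parallel vectors (v = alpha * u) the gap is zero up to rounding.
u = (1.0, 2.0, 3.0)
v = tuple(-2.5 * a for a in u)
print(cs_gap(u, v))
```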


Now we can prove the triangle inequality. 


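Proposition 1.6 Triangle Inequality

Let $V$ be an inner product space. For any vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ in $V$,

$$\|\mathbf{v}_1 + \cdots + \mathbf{v}_k\| \le \|\mathbf{v}_1\| + \cdots + \|\mathbf{v}_k\|.$$

It is sufficient to prove the statement when $k = 2$. The general case follows
from induction. Given $\mathbf{v}_1$ and $\mathbf{v}_2$ in $V$, the Cauchy-Schwarz inequality gives

$$\begin{aligned}
\|\mathbf{v}_1 + \mathbf{v}_2\|^2 &= \langle \mathbf{v}_1 + \mathbf{v}_2, \mathbf{v}_1 + \mathbf{v}_2 \rangle \\
&= \langle \mathbf{v}_1, \mathbf{v}_1 \rangle + 2\langle \mathbf{v}_1, \mathbf{v}_2 \rangle + \langle \mathbf{v}_2, \mathbf{v}_2 \rangle \\
&\le \|\mathbf{v}_1\|^2 + 2\|\mathbf{v}_1\|\,\|\mathbf{v}_2\| + \|\mathbf{v}_2\|^2 \\
&= (\|\mathbf{v}_1\| + \|\mathbf{v}_2\|)^2.
\end{aligned}$$

This proves that

$$\|\mathbf{v}_1 + \mathbf{v}_2\| \le \|\mathbf{v}_1\| + \|\mathbf{v}_2\|.$$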


From the triangle inequality, we can deduce the following. 


Corollary 1.7

Let $V$ be an inner product space. For any vectors $\mathbf{u}$ and $\mathbf{v}$ in $V$,

$$\big|\,\|\mathbf{u}\| - \|\mathbf{v}\|\,\big| \le \|\mathbf{u} - \mathbf{v}\|.$$

Expressed in terms of distance, the triangle inequality takes the following form.


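Proposition 1.8 Triangle Inequality

Let $V$ be an inner product space. For any three points $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$ in $V$,

$$d(\mathbf{v}_1, \mathbf{v}_2) \le d(\mathbf{v}_1, \mathbf{v}_3) + d(\mathbf{v}_2, \mathbf{v}_3).$$

More generally, if $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ are $k$ vectors in $V$, then

$$d(\mathbf{v}_1, \mathbf{v}_k) \le \sum_{i=2}^{k} d(\mathbf{v}_{i-1}, \mathbf{v}_i) = d(\mathbf{v}_1, \mathbf{v}_2) + \cdots + d(\mathbf{v}_{k-1}, \mathbf{v}_k).$$

Since we can define the distance function on an inner product space, an inner
product space is a special case of a metric space.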


Definition 1.12 Metric Space

Let $X$ be a set, and let $d : X \times X \to \mathbb{R}$ be a function defined on $X \times X$.
We say that $d$ is a metric on $X$ provided that the following conditions are
satisfied.

1. For any $x$ and $y$ in $X$, $d(x, y) \ge 0$, and $d(x, y) = 0$ if and only if $x = y$.

2. $d(x, y) = d(y, x)$ for any $x$ and $y$ in $X$.

3. For any $x$, $y$ and $z$ in $X$, $d(x, y) \le d(x, z) + d(y, z)$.

If $d$ is a metric on $X$, we say that $(X, d)$ is a metric space.

Metric spaces play important roles in advanced analysis. If $V$ is an inner
product space, it is a metric space with metric

$$d(\mathbf{u}, \mathbf{v}) = \|\mathbf{v} - \mathbf{u}\|.$$


Using the Cauchy-Schwarz inequality, one can generalize the concept of angles
to any two vectors in a real inner product space. If $\mathbf{u}$ and $\mathbf{v}$ are two nonzero vectors
in a real inner product space $V$, the Cauchy-Schwarz inequality implies that

$$\frac{\langle \mathbf{u}, \mathbf{v} \rangle}{\|\mathbf{u}\|\,\|\mathbf{v}\|}$$

is a real number between $-1$ and $1$. Generalizing the formula (1.1), we define the
angle $\theta$ between $\mathbf{u}$ and $\mathbf{v}$ as

$$\theta = \cos^{-1} \frac{\langle \mathbf{u}, \mathbf{v} \rangle}{\|\mathbf{u}\|\,\|\mathbf{v}\|}.$$

This is an angle between 0° and 180°. A necessary and sufficient condition for
two vectors $\mathbf{u}$ and $\mathbf{v}$ to make a 90° angle is $\langle \mathbf{u}, \mathbf{v} \rangle = 0$.


Definition 1.13 Orthogonality

Let $V$ be a real inner product space. We say that the two vectors $\mathbf{u}$ and $\mathbf{v}$ in
$V$ are orthogonal if $\langle \mathbf{u}, \mathbf{v} \rangle = 0$.

Lemma 1.9 Generalized Pythagoras Theorem

Let $V$ be an inner product space. If $\mathbf{u}$ and $\mathbf{v}$ are orthogonal vectors in $V$,
then

$$\|\mathbf{u} + \mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2.$$

Now we discuss the projection theorem.

Theorem 1.10 Projection Theorem

Let $V$ be an inner product space, and let $\mathbf{w}$ be a nonzero vector in $V$. If $\mathbf{v}$
is a vector in $V$, there is a unique way to write $\mathbf{v}$ as a sum of two vectors $\mathbf{v}_1$
and $\mathbf{v}_2$, such that $\mathbf{v}_1$ is parallel to $\mathbf{w}$ and $\mathbf{v}_2$ is orthogonal to $\mathbf{w}$. Moreover,
for any real number $\alpha$,

$$\|\mathbf{v} - \alpha\mathbf{w}\| \ge \|\mathbf{v} - \mathbf{v}_1\|,$$

and the equality holds if and only if $\alpha$ is equal to the unique real number $\beta$
such that $\mathbf{v}_1 = \beta\mathbf{w}$.


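Figure 1.4: The projection theorem.

Assume that $\mathbf{v}$ can be written as a sum of two vectors $\mathbf{v}_1$ and $\mathbf{v}_2$, such that
$\mathbf{v}_1$ is parallel to $\mathbf{w}$ and $\mathbf{v}_2$ is orthogonal to $\mathbf{w}$. Since $\mathbf{w}$ is nonzero, there
is a real number $\beta$ such that $\mathbf{v}_1 = \beta\mathbf{w}$. Since $\mathbf{v}_2 = \mathbf{v} - \mathbf{v}_1 = \mathbf{v} - \beta\mathbf{w}$ is
orthogonal to $\mathbf{w}$, we have

$$0 = \langle \mathbf{v} - \beta\mathbf{w}, \mathbf{w} \rangle = \langle \mathbf{v}, \mathbf{w} \rangle - \beta\langle \mathbf{w}, \mathbf{w} \rangle.$$

This implies that we must have

$$\beta = \frac{\langle \mathbf{v}, \mathbf{w} \rangle}{\langle \mathbf{w}, \mathbf{w} \rangle}, \qquad \mathbf{v}_1 = \frac{\langle \mathbf{v}, \mathbf{w} \rangle}{\langle \mathbf{w}, \mathbf{w} \rangle}\mathbf{w}, \qquad \mathbf{v}_2 = \mathbf{v} - \frac{\langle \mathbf{v}, \mathbf{w} \rangle}{\langle \mathbf{w}, \mathbf{w} \rangle}\mathbf{w}.$$

It is easy to check that $\mathbf{v}_1$ and $\mathbf{v}_2$ given by these formulas indeed satisfy
the requirements that $\mathbf{v}_1$ is parallel to $\mathbf{w}$ and $\mathbf{v}_2$ is orthogonal to $\mathbf{w}$. This
establishes the existence and uniqueness of $\mathbf{v}_1$ and $\mathbf{v}_2$.

Now for any real number $\alpha$,

$$\mathbf{v} - \alpha\mathbf{w} = \mathbf{v} - \mathbf{v}_1 + (\beta - \alpha)\mathbf{w}.$$

Since $\mathbf{v} - \mathbf{v}_1 = \mathbf{v}_2$ is orthogonal to $(\beta - \alpha)\mathbf{w}$, the generalized Pythagoras
theorem implies that

$$\|\mathbf{v} - \alpha\mathbf{w}\|^2 = \|\mathbf{v} - \mathbf{v}_1\|^2 + \|(\beta - \alpha)\mathbf{w}\|^2 \ge \|\mathbf{v} - \mathbf{v}_1\|^2.$$

This proves that

$$\|\mathbf{v} - \alpha\mathbf{w}\| \ge \|\mathbf{v} - \mathbf{v}_1\|.$$

The equality holds if and only if

$$\|(\beta - \alpha)\mathbf{w}\| = |\alpha - \beta|\,\|\mathbf{w}\| = 0.$$

Since $\|\mathbf{w}\| \ne 0$, we must have $\alpha = \beta$.

The vector $\mathbf{v}_1$ in this theorem is called the projection of $\mathbf{v}$ onto the subspace
spanned by $\mathbf{w}$.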
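The decomposition in the projection theorem is straightforward to compute in Euclidean space. The following sketch uses the coefficient $\langle \mathbf{v}, \mathbf{w} \rangle / \langle \mathbf{w}, \mathbf{w} \rangle$ from the proof; the function name `project_onto` and the sample vectors are illustrative assumptions.

```python
def dot(u, v):
    """Euclidean inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def project_onto(v, w):
    """Split v as v1 + v2 with v1 parallel to w and v2 orthogonal to w.

    w must be a nonzero vector.
    """
    beta = dot(v, w) / dot(w, w)
    v1 = tuple(beta * x for x in w)
    v2 = tuple(a - b for a, b in zip(v, v1))
    return v1, v2


v = (3.0, 4.0)
w = (1.0, 0.0)
v1, v2 = project_onto(v, w)
print(v1, v2)      # the parallel part and the orthogonal part
print(dot(v2, w))  # zero, since v2 is orthogonal to w
```

By the theorem, among all multiples $\alpha\mathbf{w}$, the vector `v1` is the one closest to `v`.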

There is a more general projection theorem where the subspace W spanned by 
w is replaced by a general subspace. We say that a vector v is orthogonal to the 
subspace W if it is orthogonal to each vector w in W. 


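Theorem 1.11 General Projection Theorem

Let $V$ be an inner product space, and let $W$ be a finite dimensional subspace
of $V$. If $\mathbf{v}$ is a vector in $V$, there is a unique way to write $\mathbf{v}$ as a sum of
two vectors $\mathbf{v}_1$ and $\mathbf{v}_2$, such that $\mathbf{v}_1$ is in $W$ and $\mathbf{v}_2$ is orthogonal to $W$. The
vector $\mathbf{v}_1$ is denoted by $\text{proj}_W \mathbf{v}$. For any $\mathbf{w} \in W$,

$$\|\mathbf{v} - \mathbf{w}\| \ge \|\mathbf{v} - \text{proj}_W \mathbf{v}\|,$$

and the equality holds if and only if $\mathbf{w} = \text{proj}_W \mathbf{v}$.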


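Sketch of Proof
If $W$ is a $k$-dimensional vector space, it has a basis consisting of $k$ linearly
independent vectors $\mathbf{w}_1, \ldots, \mathbf{w}_k$. Since the vector $\mathbf{v}_1$ is in $W$, there are
constants $c_1, \ldots, c_k$ such that

$$\mathbf{v}_1 = c_1\mathbf{w}_1 + \cdots + c_k\mathbf{w}_k.$$

The condition that $\mathbf{v}_2 = \mathbf{v} - \mathbf{v}_1$ is orthogonal to $W$ gives rise to $k$ equations

$$\begin{aligned}
c_1\langle \mathbf{w}_1, \mathbf{w}_1 \rangle + \cdots + c_k\langle \mathbf{w}_k, \mathbf{w}_1 \rangle &= \langle \mathbf{v}, \mathbf{w}_1 \rangle, \\
&\;\;\vdots \\
c_1\langle \mathbf{w}_1, \mathbf{w}_k \rangle + \cdots + c_k\langle \mathbf{w}_k, \mathbf{w}_k \rangle &= \langle \mathbf{v}, \mathbf{w}_k \rangle.
\end{aligned} \qquad (1.2)$$

Using the fact that $\mathbf{w}_1, \ldots, \mathbf{w}_k$ are linearly independent, one can show that
the $k \times k$ matrix

$$A = \begin{bmatrix} \langle \mathbf{w}_1, \mathbf{w}_1 \rangle & \cdots & \langle \mathbf{w}_k, \mathbf{w}_1 \rangle \\ \vdots & \ddots & \vdots \\ \langle \mathbf{w}_1, \mathbf{w}_k \rangle & \cdots & \langle \mathbf{w}_k, \mathbf{w}_k \rangle \end{bmatrix}$$

is invertible. This shows that there is a unique $\mathbf{c} = (c_1, \ldots, c_k)$ satisfying
the linear system (1.2).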


If $V$ is an inner product space, a basis that consists of mutually orthogonal
vectors is of special interest.

Definition 1.14 Orthogonal Set and Orthonormal Set

Let $V$ be an inner product space. A subset of vectors $S = \{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$
is called an orthogonal set if any two distinct vectors $\mathbf{u}_i$ and $\mathbf{u}_j$ in $S$ are
orthogonal. Namely,

$$\langle \mathbf{u}_i, \mathbf{u}_j \rangle = 0 \quad \text{if } i \ne j.$$

$S$ is called an orthonormal set if it is an orthogonal set of unit vectors.
Namely,

$$\langle \mathbf{u}_i, \mathbf{u}_j \rangle = \begin{cases} 0 & \text{if } i \ne j, \\ 1 & \text{if } i = j. \end{cases}$$

If $S = \{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ is an orthogonal set of nonzero vectors, it is a linearly
independent set of vectors. One can construct an orthonormal set by normalizing
each vector in the set. There is a standard algorithm, known as the Gram-Schmidt
process, which can turn any linearly independent set of vectors $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ into
an orthogonal set $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ of nonzero vectors. We start with the following
lemma.
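Lemma 1.12

Let $V$ be an inner product space, and let $S = \{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ be an orthogonal
set of nonzero vectors in $V$ that spans the subspace $W$. Given any vector $\mathbf{v}$
in $V$,

$$\text{proj}_W \mathbf{v} = \sum_{i=1}^{k} \frac{\langle \mathbf{v}, \mathbf{u}_i \rangle}{\langle \mathbf{u}_i, \mathbf{u}_i \rangle}\mathbf{u}_i.$$

By the general projection theorem, $\mathbf{v} = \mathbf{v}_1 + \mathbf{v}_2$, where $\mathbf{v}_1 = \text{proj}_W \mathbf{v}$ is in
$W$ and $\mathbf{v}_2$ is orthogonal to $W$. Since $S$ is a basis for $W$, there exist scalars
$c_1, c_2, \ldots, c_k$ such that $\mathbf{v}_1 = c_1\mathbf{u}_1 + \cdots + c_k\mathbf{u}_k$. Therefore,

$$\mathbf{v} = c_1\mathbf{u}_1 + \cdots + c_k\mathbf{u}_k + \mathbf{v}_2.$$

Since $S$ is an orthogonal set of vectors and $\mathbf{v}_2$ is orthogonal to each $\mathbf{u}_i$, we
find that for $1 \le i \le k$,

$$\langle \mathbf{v}, \mathbf{u}_i \rangle = c_i\langle \mathbf{u}_i, \mathbf{u}_i \rangle.$$

This proves the lemma.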


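Theorem 1.13 Gram-Schmidt Process

Let $V$ be an inner product space, and assume that $S = \{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ is
a linearly independent set of vectors in $V$. Define the vectors $\mathbf{u}_1, \ldots, \mathbf{u}_k$
inductively by $\mathbf{u}_1 = \mathbf{v}_1$, and for $2 \le j \le k$,

$$\mathbf{u}_j = \mathbf{v}_j - \sum_{i=1}^{j-1} \frac{\langle \mathbf{v}_j, \mathbf{u}_i \rangle}{\langle \mathbf{u}_i, \mathbf{u}_i \rangle}\mathbf{u}_i.$$

Then $S' = \{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ is an orthogonal set of nonzero vectors. Moreover,
for each $1 \le j \le k$, the set $\{\mathbf{u}_i \mid 1 \le i \le j\}$ spans the same subspace as
the set $\{\mathbf{v}_i \mid 1 \le i \le j\}$.

Sketch of Proof
For $1 \le j \le k$, let $W_j$ be the subspace spanned by the set $\{\mathbf{v}_i \mid 1 \le i \le j\}$.
The vectors $\mathbf{u}_1, \ldots, \mathbf{u}_k$ are constructed by letting $\mathbf{u}_1 = \mathbf{v}_1$, and for $2 \le j \le k$,

$$\mathbf{u}_j = \mathbf{v}_j - \text{proj}_{W_{j-1}} \mathbf{v}_j.$$

Since $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$ is a linearly independent set, $\mathbf{u}_j \ne \mathbf{0}$. Using induction,
one can show that $\text{span}\{\mathbf{u}_1, \ldots, \mathbf{u}_j\} = \text{span}\{\mathbf{v}_1, \ldots, \mathbf{v}_j\}$. By the projection
theorem, $\mathbf{u}_j$ is orthogonal to $W_{j-1}$. Hence, it is orthogonal to $\mathbf{u}_1, \ldots, \mathbf{u}_{j-1}$.
This proves the theorem.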
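The recursion in Theorem 1.13 translates almost line by line into code for vectors in Euclidean space. The sketch below is a minimal illustration under the assumption that the input vectors are linearly independent; the function name `gram_schmidt` and the sample vectors are chosen here and are not from the original text. It follows the formula of the theorem literally (the coefficients use the original vector $\mathbf{v}_j$), which is fine for small examples but is known to be sensitive to rounding error for nearly dependent inputs.

```python
def dot(u, v):
    """Euclidean inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors):
    """Turn linearly independent vectors v_1..v_k into orthogonal u_1..u_k.

    u_1 = v_1 and u_j = v_j - sum_{i<j} (<v_j, u_i> / <u_i, u_i>) u_i.
    """
    us = []
    for v in vectors:
        u = list(v)
        for prev in us:
            coeff = dot(v, prev) / dot(prev, prev)
            u = [a - coeff * b for a, b in zip(u, prev)]
        us.append(tuple(u))
    return us


vs = [(1.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
us = gram_schmidt(vs)
for u in us:
    print(u)

# Distinct output vectors are orthogonal up to rounding error.
for i in range(len(us)):
    for j in range(i):
        assert abs(dot(us[i], us[j])) < 1e-12
```

Dividing each output vector by its norm would then give an orthonormal set.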


A mapping between two vector spaces that respects the linear structures is
called a linear transformation.

Definition 1.15 Linear Transformation

Let $V$ and $W$ be real vector spaces. A mapping $T : V \to W$ is called
a linear transformation provided that for any $\mathbf{v}_1, \ldots, \mathbf{v}_k$ in $V$, for any real
numbers $c_1, \ldots, c_k$,

$$T(c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k) = c_1T(\mathbf{v}_1) + \cdots + c_kT(\mathbf{v}_k).$$

Linear transformations play important roles in multivariable analysis. In the
following, we first define a special class of linear transformations associated to
special projections.


For $1 \le i \le n$, let $L_i$ be the subspace of $\mathbb{R}^n$ spanned by the unit vector $\mathbf{e}_i$. For
the point $\mathbf{x} = (x_1, \ldots, x_n)$,

$$\text{proj}_{L_i} \mathbf{x} = x_i\mathbf{e}_i.$$

The number $x_i$ is the $i$-th component of $\mathbf{x}$. It will play important roles later. The
mapping from $\mathbf{x}$ to $x_i$ is a function from $\mathbb{R}^n$ to $\mathbb{R}$.

Definition 1.16 Projection Functions

For $1 \le i \le n$, the $i$-th projection function on $\mathbb{R}^n$ is the function $\pi_i : \mathbb{R}^n \to
\mathbb{R}$ defined by

$$\pi_i(\mathbf{x}) = \pi_i(x_1, \ldots, x_n) = x_i.$$

Figure 1.5: The projection functions.


The following is obvious. 


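Proposition 1.14

For $1 \le i \le n$, the $i$-th projection function on $\mathbb{R}^n$ is a linear transformation.
Namely, for any $\mathbf{x}_1, \ldots, \mathbf{x}_k$ in $\mathbb{R}^n$, and any real numbers $c_1, \ldots, c_k$,

$$\pi_i(c_1\mathbf{x}_1 + \cdots + c_k\mathbf{x}_k) = c_1\pi_i(\mathbf{x}_1) + \cdots + c_k\pi_i(\mathbf{x}_k).$$

The following is a useful inequality.

Proposition 1.15

Let $\mathbf{x}$ be a vector in $\mathbb{R}^n$. Then

$$|\pi_i(\mathbf{x})| \le \|\mathbf{x}\|.$$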


At the end of this section, let us introduce the concept of hyperplanes. 


Definition 1.17 Hyperplanes

In $\mathbb{R}^n$, a hyperplane is a translate of a subspace of dimension $n - 1$. In other
words, $H$ is a hyperplane if there is a point $\mathbf{x}_0$ in $\mathbb{R}^n$, and $n - 1$ linearly
independent vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{n-1}$ such that $H$ contains all points $\mathbf{x}$ of
the form

$$\mathbf{x} = \mathbf{x}_0 + t_1\mathbf{v}_1 + \cdots + t_{n-1}\mathbf{v}_{n-1}, \quad (t_1, \ldots, t_{n-1}) \in \mathbb{R}^{n-1}.$$

A hyperplane in $\mathbb{R}^1$ is a point. A hyperplane in $\mathbb{R}^2$ is a line. A hyperplane in
$\mathbb{R}^3$ is a plane.


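Definition 1.18 Normal Vectors

Let $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{n-1}$ be linearly independent vectors in $\mathbb{R}^n$, and let $H$ be
the hyperplane

$$H = \{\mathbf{x}_0 + t_1\mathbf{v}_1 + \cdots + t_{n-1}\mathbf{v}_{n-1} \mid (t_1, \ldots, t_{n-1}) \in \mathbb{R}^{n-1}\}.$$

A nonzero vector $\mathbf{n}$ that is orthogonal to all the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_{n-1}$ is
called a normal vector to the hyperplane. If $\mathbf{x}_1$ and $\mathbf{x}_2$ are two points on $H$,
then $\mathbf{n}$ is orthogonal to the vector $\mathbf{v} = \mathbf{x}_2 - \mathbf{x}_1$. Any two normal vectors of
a hyperplane are scalar multiples of each other.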


Proposition 1.16

If $H$ is a hyperplane with normal vector $\mathbf{n} = (a_1, a_2, \ldots, a_n)$, and $\mathbf{x}_0 =
(u_1, u_2, \ldots, u_n)$ is a point on $H$, then the equation of $H$ is given by

$$a_1(x_1 - u_1) + a_2(x_2 - u_2) + \cdots + a_n(x_n - u_n) = 0.$$

Conversely, any equation of the form

$$a_1x_1 + a_2x_2 + \cdots + a_nx_n = b$$

with $(a_1, a_2, \ldots, a_n) \ne \mathbf{0}$ is the equation of a hyperplane with normal vector $\mathbf{n} = (a_1, a_2, \ldots, a_n)$.
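Expanding the first equation in Proposition 1.16 gives $a_1x_1 + \cdots + a_nx_n = b$ with $b = \langle \mathbf{n}, \mathbf{x}_0 \rangle$. The short sketch below computes $b$ from a normal vector and a point, and tests whether a given point satisfies the equation. The function names `hyperplane_constant` and `on_hyperplane`, the tolerance, and the sample data are illustrative assumptions.

```python
def dot(u, v):
    """Euclidean inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def hyperplane_constant(normal, point):
    """Return b such that the hyperplane is {x : <normal, x> = b}."""
    return dot(normal, point)


def on_hyperplane(normal, b, x, tol=1e-12):
    """Check whether x satisfies <normal, x> = b up to a small tolerance."""
    return abs(dot(normal, x) - b) <= tol


n = (1.0, 2.0, 3.0)    # normal vector (a1, a2, a3)
x0 = (1.0, 1.0, 1.0)   # a point on the hyperplane
b = hyperplane_constant(n, x0)
print(b)                                    # constant term of the equation
print(on_hyperplane(n, b, (6.0, 0.0, 0.0))) # this point satisfies the equation
print(on_hyperplane(n, b, (0.0, 0.0, 0.0))) # the origin does not
```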


Example 1.10

Given $1 \le i \le n$, the equation $x_i = c$ defines a hyperplane with normal vector $\mathbf{e}_i$.
It is a hyperplane parallel to the coordinate plane $x_i = 0$, and perpendicular
to the $x_i$-axis.


Exercises 1.1 


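Question 1

Let $V$ be an inner product space. If $\mathbf{u}$ and $\mathbf{v}$ are vectors in $V$, show that

$$\big|\,\|\mathbf{u}\| - \|\mathbf{v}\|\,\big| \le \|\mathbf{u} - \mathbf{v}\|.$$

Question 2

Let $V$ be an inner product space. If $\mathbf{u}$ and $\mathbf{v}$ are orthogonal vectors in $V$,
show that

$$\|\mathbf{u} + \mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2.$$

Question 3

Let $V$ be an inner product space, and let $\mathbf{u}$ and $\mathbf{v}$ be vectors in $V$. Show
that

$$\langle \mathbf{u}, \mathbf{v} \rangle = \frac{\|\mathbf{u} + \mathbf{v}\|^2 - \|\mathbf{u} - \mathbf{v}\|^2}{4}.$$

Question 4

Let $V$ be an inner product space, and let $\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}$ be an orthonormal
set of vectors in $V$. For any real numbers $\alpha_1, \ldots, \alpha_k$, show that

$$\|\alpha_1\mathbf{u}_1 + \cdots + \alpha_k\mathbf{u}_k\|^2 = \alpha_1^2 + \cdots + \alpha_k^2.$$

Question 5

Let $x_1, x_2, \ldots, x_n$ be real numbers. Show that

(a) $\sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} \le |x_1| + |x_2| + \cdots + |x_n|$;

(b) $|x_1 + x_2 + \cdots + x_n| \le \sqrt{n}\sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$.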


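1.2 Convergence of Sequences in $\mathbb{R}^n$

A point in the Euclidean space $\mathbb{R}^n$ is denoted by $\mathbf{x} = (x_1, x_2, \ldots, x_n)$. When
$n = 1$, we just denote it by $x$. When $n = 2$ and $n = 3$, it is customary to denote a
point in $\mathbb{R}^2$ and $\mathbb{R}^3$ by $(x, y)$ and $(x, y, z)$ respectively.
The Euclidean inner product between the vectors $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ and
$\mathbf{y} = (y_1, y_2, \ldots, y_n)$ is

$$\langle \mathbf{x}, \mathbf{y} \rangle = \sum_{i=1}^{n} x_iy_i.$$

The norm of $\mathbf{x}$ is

$$\|\mathbf{x}\| = \sqrt{\sum_{i=1}^{n} x_i^2},$$

while the distance between $\mathbf{x}$ and $\mathbf{y}$ is

$$d(\mathbf{x}, \mathbf{y}) = \|\mathbf{x} - \mathbf{y}\| = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}.$$

A sequence in $\mathbb{R}^n$ is a function $f : \mathbb{Z}^+ \to \mathbb{R}^n$. For $k \in \mathbb{Z}^+$, let $\mathbf{a}_k = f(k)$.
Then we can also denote the sequence by $\{\mathbf{a}_k\}_{k=1}^{\infty}$, or simply as $\{\mathbf{a}_k\}$.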


Example 1.11

The sequence

$$\left\{\left(\frac{k}{k+1}, \frac{2k+3}{k}\right)\right\}$$

is a sequence in $\mathbb{R}^2$.

In volume I, we have seen that a sequence of real numbers $\{a_k\}_{k=1}^{\infty}$ is said to
converge to a real number $a$ provided that for any $\varepsilon > 0$, there is a positive integer
$K$ such that

$$|a_k - a| < \varepsilon \quad \text{for all } k \ge K.$$

Notice that $|a_k - a|$ is the distance between $a_k$ and $a$. To define the convergence
of a sequence in $\mathbb{R}^n$, we use the Euclidean distance.


Definition 1.19 Convergence of Sequences

A sequence $\{\mathbf{a}_k\}$ in $\mathbb{R}^n$ is said to converge to the point $\mathbf{a}$ in $\mathbb{R}^n$ provided
that for any $\varepsilon > 0$, there is a positive integer $K$ so that for all $k \ge K$,

$$\|\mathbf{a}_k - \mathbf{a}\| = d(\mathbf{a}_k, \mathbf{a}) < \varepsilon.$$

If $\{\mathbf{a}_k\}$ is a sequence that converges to a point $\mathbf{a}$, we say that the sequence
$\{\mathbf{a}_k\}$ is convergent. A sequence that does not converge to any point in $\mathbb{R}^n$
is said to be divergent.

Figure 1.6: The convergence of a sequence.

As in the $n = 1$ case, we have the following.

Proposition 1.17

A sequence in $\mathbb{R}^n$ cannot converge to two different points.

Definition 1.20 Limit of a Sequence

If $\{\mathbf{a}_k\}$ is a sequence in $\mathbb{R}^n$ that converges to the point $\mathbf{a}$, we call $\mathbf{a}$ the limit
of the sequence. This can be expressed as

$$\lim_{k \to \infty} \mathbf{a}_k = \mathbf{a}.$$


The following is easy to establish. 


Proposition 1.18

Let $\{\mathbf{a}_k\}$ be a sequence in $\mathbb{R}^n$. Then $\{\mathbf{a}_k\}$ converges to $\mathbf{a}$ if and only if

$$\lim_{k \to \infty} \|\mathbf{a}_k - \mathbf{a}\| = 0.$$

By definition, the sequence $\{\mathbf{a}_k\}$ converges to $\mathbf{a}$ if and only if for any $\varepsilon > 0$,
there is a positive integer $K$ so that for all $k \ge K$, $\|\mathbf{a}_k - \mathbf{a}\| < \varepsilon$. This is
the definition of $\lim_{k \to \infty} \|\mathbf{a}_k - \mathbf{a}\| = 0$.

As in the $n = 1$ case, $\{\mathbf{a}_{k_j}\}_{j=1}^{\infty}$ is a subsequence of $\{\mathbf{a}_k\}$ if $k_1, k_2, k_3, \ldots$ is a
strictly increasing sequence of positive integers.

Corollary 1.19

If $\{\mathbf{a}_k\}$ is a sequence in $\mathbb{R}^n$ that converges to the point $\mathbf{a}$, then any
subsequence of $\{\mathbf{a}_k\}$ also converges to $\mathbf{a}$.


Example 1.12 


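Let us investigate the convergence of the sequence $\{\mathbf{a}_k\}$ in $\mathbb{R}^2$ with
\[ \mathbf{a}_k = \left( \frac{k}{k+1}, \frac{2k+3}{k} \right) \]
that is defined in Example 1.11. Notice that
\[ \lim_{k\to\infty} \pi_1(\mathbf{a}_k) = \lim_{k\to\infty} \frac{k}{k+1} = 1, \qquad \lim_{k\to\infty} \pi_2(\mathbf{a}_k) = \lim_{k\to\infty} \frac{2k+3}{k} = 2. \]
It is natural for us to speculate that the sequence $\{\mathbf{a}_k\}$ converges to the point $\mathbf{a} = (1, 2)$.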




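For $k \in \mathbb{Z}^+$,
\[ \mathbf{a}_k - \mathbf{a} = \left( -\frac{1}{k+1}, \frac{3}{k} \right). \]
Thus,
\[ \|\mathbf{a}_k - \mathbf{a}\| = \sqrt{\frac{1}{(k+1)^2} + \frac{9}{k^2}} \leq \frac{\sqrt{10}}{k}. \]
By squeeze theorem,
\[ \lim_{k\to\infty} \|\mathbf{a}_k - \mathbf{a}\| = 0. \]
This proves that the sequence $\{\mathbf{a}_k\}$ indeed converges to the point $\mathbf{a} = (1, 2)$.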
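
The convergence in Example 1.12 can be checked numerically. The Python sketch below is only an illustration: it assumes the sequence has the form $\mathbf{a}_k = (k/(k+1), (2k+3)/k)$, which is consistent with the limit $(1, 2)$ found above, and the helper names `a` and `dist` are made up for this example.

```python
import math

def a(k):
    # Assumed form of the sequence in Example 1.11: a_k = (k/(k+1), (2k+3)/k).
    return (k / (k + 1), (2 * k + 3) / k)

def dist(p, q):
    # Euclidean distance between two points of R^n.
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

limit = (1.0, 2.0)
indices = (1, 10, 100, 1000)
distances = [dist(a(k), limit) for k in indices]

for k, d in zip(indices, distances):
    # The distance to the limit should shrink roughly like sqrt(10)/k.
    print(k, d, math.sqrt(10) / k)
```

The printed distances decrease towards 0 and stay below $\sqrt{10}/k$, which is the kind of bound the squeeze theorem needs.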


In the example above, we guessed the limit of the sequence by looking at each component of the sequence. This in fact works for any sequence.


Theorem 1.20 Componentwise Convergence of Sequences 


A sequence $\{\mathbf{a}_k\}$ in $\mathbb{R}^n$ converges to the point $\mathbf{a}$ if and only if for each $1 \leq i \leq n$, the sequence $\{\pi_i(\mathbf{a}_k)\}$ converges to the point $\pi_i(\mathbf{a})$.


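Given $1 \leq i \leq n$,
\[ \pi_i(\mathbf{a}_k) - \pi_i(\mathbf{a}) = \pi_i(\mathbf{a}_k - \mathbf{a}). \]
Thus,
\[ |\pi_i(\mathbf{a}_k) - \pi_i(\mathbf{a})| = |\pi_i(\mathbf{a}_k - \mathbf{a})| \leq \|\mathbf{a}_k - \mathbf{a}\|. \]
If the sequence $\{\mathbf{a}_k\}$ converges to the point $\mathbf{a}$, then
\[ \lim_{k\to\infty} \|\mathbf{a}_k - \mathbf{a}\| = 0. \]
By squeeze theorem,
\[ \lim_{k\to\infty} |\pi_i(\mathbf{a}_k) - \pi_i(\mathbf{a})| = 0. \]
This proves that the sequence $\{\pi_i(\mathbf{a}_k)\}$ converges to the point $\pi_i(\mathbf{a})$.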




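Conversely, assume that for each $1 \leq i \leq n$, the sequence $\{\pi_i(\mathbf{a}_k)\}$ converges to the point $\pi_i(\mathbf{a})$. Then
\[ \lim_{k\to\infty} |\pi_i(\mathbf{a}_k) - \pi_i(\mathbf{a})| = 0 \quad \text{for } 1 \leq i \leq n. \]
Since
\[ \|\mathbf{a}_k - \mathbf{a}\| \leq \sum_{i=1}^{n} |\pi_i(\mathbf{a}_k - \mathbf{a})|, \]
squeeze theorem implies that
\[ \lim_{k\to\infty} \|\mathbf{a}_k - \mathbf{a}\| = 0. \]
This proves that the sequence $\{\mathbf{a}_k\}$ converges to the point $\mathbf{a}$.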
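
The proof of Theorem 1.20 rests on the two-sided estimate $|\pi_i(\mathbf{x})| \leq \|\mathbf{x}\| \leq \sum_i |\pi_i(\mathbf{x})|$. The following Python sketch spot-checks this estimate on random vectors. It is a sanity check under floating-point arithmetic, not a proof, and the names `norm` and `sandwich_holds` are hypothetical.

```python
import math
import random

def norm(x):
    # Euclidean norm of a vector given as a list of floats.
    return math.sqrt(sum(t * t for t in x))

def sandwich_holds(x, tol=1e-12):
    # Checks |x_i| <= ||x|| for every i, and ||x|| <= |x_1| + ... + |x_n|.
    # The tolerance only absorbs floating-point rounding.
    n = norm(x)
    return all(abs(t) <= n + tol for t in x) and n <= sum(abs(t) for t in x) + tol

random.seed(0)
samples = [[random.uniform(-5, 5) for _ in range(4)] for _ in range(1000)]
all_ok = all(sandwich_holds(x) for x in samples)
print(all_ok)
```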


Theorem 1.20 reduces the investigation of convergence of sequences in $\mathbb{R}^n$ to that of sequences in $\mathbb{R}$. Let us look at a few examples.


Example 1.13 


Find the following limit. 




Example 1.14 


Let $\{\mathbf{a}_k\}$ be the sequence with


nS (1 see 


Is the sequence convergent? Justify your answer. 


Solution 
The sequence $\{\pi_1(\mathbf{a}_k)\}$ is the sequence $\{(-1)^k\}$, which is divergent. Hence, the sequence $\{\mathbf{a}_k\}$ is divergent.


Using the componentwise convergence theorem, it is easy to establish the 
following. 


Proposition 1.21 Linearity 


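Let $\{\mathbf{a}_k\}$ and $\{\mathbf{b}_k\}$ be sequences in $\mathbb{R}^n$ that converge to $\mathbf{a}$ and $\mathbf{b}$ respectively. For any real numbers $\alpha$ and $\beta$, the sequence $\{\alpha\mathbf{a}_k + \beta\mathbf{b}_k\}$ converges to $\alpha\mathbf{a} + \beta\mathbf{b}$. Namely,
\[ \lim_{k\to\infty} (\alpha\mathbf{a}_k + \beta\mathbf{b}_k) = \alpha\mathbf{a} + \beta\mathbf{b}. \]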


Example 1.15 


If $\{\mathbf{a}_k\}$ is a sequence in $\mathbb{R}^n$ that converges to $\mathbf{a}$, show that
\[ \lim_{k\to\infty} \|\mathbf{a}_k\| = \|\mathbf{a}\|. \]




Solution 
Notice that 


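\[ \|\mathbf{a}_k\| = \sqrt{\pi_1(\mathbf{a}_k)^2 + \cdots + \pi_n(\mathbf{a}_k)^2}. \]
For $1 \leq i \leq n$,
\[ \lim_{k\to\infty} \pi_i(\mathbf{a}_k) = \pi_i(\mathbf{a}). \]
Using limit laws for sequences in $\mathbb{R}$, we have
\[ \lim_{k\to\infty} \left( \pi_1(\mathbf{a}_k)^2 + \cdots + \pi_n(\mathbf{a}_k)^2 \right) = \pi_1(\mathbf{a})^2 + \cdots + \pi_n(\mathbf{a})^2. \]
Using the fact that the square root function is continuous, we find that
\[ \lim_{k\to\infty} \|\mathbf{a}_k\| = \lim_{k\to\infty} \sqrt{\pi_1(\mathbf{a}_k)^2 + \cdots + \pi_n(\mathbf{a}_k)^2} = \sqrt{\pi_1(\mathbf{a})^2 + \cdots + \pi_n(\mathbf{a})^2} = \|\mathbf{a}\|. \]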


There is also a Cauchy criterion for convergence of sequences in R”. 


Definition 1.21 Cauchy Sequences 


A sequence $\{\mathbf{a}_k\}$ in $\mathbb{R}^n$ is a Cauchy sequence if for every $\varepsilon > 0$, there is a positive integer $K$ such that for all $l \geq k \geq K$,
\[ \|\mathbf{a}_l - \mathbf{a}_k\| < \varepsilon. \]


Theorem 1.22 Cauchy Criterion 


A sequence $\{\mathbf{a}_k\}$ in $\mathbb{R}^n$ is convergent if and only if it is a Cauchy sequence.


Similar to the $n = 1$ case, the Cauchy criterion allows us to determine whether a sequence in $\mathbb{R}^n$ is convergent without having to guess what the limit is first.




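Assume that the sequence $\{\mathbf{a}_k\}$ converges to $\mathbf{a}$. Given $\varepsilon > 0$, there is a positive integer $K$ such that for all $k \geq K$, $\|\mathbf{a}_k - \mathbf{a}\| < \varepsilon/2$. Then for all $l \geq k \geq K$,
\[ \|\mathbf{a}_l - \mathbf{a}_k\| \leq \|\mathbf{a}_l - \mathbf{a}\| + \|\mathbf{a}_k - \mathbf{a}\| < \varepsilon. \]
This proves that $\{\mathbf{a}_k\}$ is a Cauchy sequence.

Conversely, assume that $\{\mathbf{a}_k\}$ is a Cauchy sequence. Given $\varepsilon > 0$, there is a positive integer $K$ such that for all $l \geq k \geq K$,
\[ \|\mathbf{a}_l - \mathbf{a}_k\| < \varepsilon. \]
For each $1 \leq i \leq n$,
\[ |\pi_i(\mathbf{a}_l) - \pi_i(\mathbf{a}_k)| = |\pi_i(\mathbf{a}_l - \mathbf{a}_k)| \leq \|\mathbf{a}_l - \mathbf{a}_k\|. \]
Hence, $\{\pi_i(\mathbf{a}_k)\}$ is a Cauchy sequence in $\mathbb{R}$. Therefore, it is convergent.

By the componentwise convergence theorem, the sequence $\{\mathbf{a}_k\}$ is convergent.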
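
To get a feeling for the Cauchy condition, one can look at how far apart the terms of a tail of a sequence are from each other. The Python sketch below does this for a hypothetical sequence $\mathbf{a}_k = (1/k, (-1)^k/k^2)$ in $\mathbb{R}^2$, chosen only for illustration. The function `tail_diameter` (an assumed name) computes the largest distance between terms with indices between $K$ and $M$.

```python
import math

def term(k):
    # Hypothetical sequence used only for illustration: a_k = (1/k, (-1)^k / k^2).
    return (1 / k, (-1) ** k / k ** 2)

def tail_diameter(K, M):
    # Largest distance ||a_l - a_k|| over all K <= k, l <= M.
    pts = [term(k) for k in range(K, M + 1)]
    return max(math.dist(p, q) for p in pts for q in pts)

for K in (1, 10, 100):
    # The tail diameter shrinks as K grows, as the Cauchy condition requires.
    print(K, tail_diameter(K, 300))
```

Only finitely many terms are inspected, so this illustrates the definition rather than verifies it.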




Exercises 1.2 


Question 1 


Show that a sequence in R” cannot converge to two different points. 


Question 2 


Find the limit of the sequence $\{\mathbf{a}_k\}$, where


GS Jak? +k ( 2) | 
a, = 5 1+ — P 


k+3?’ k k 


Question 3 


Let $\{\mathbf{a}_k\}$ be the sequence with


1k ” Ok 


(-1)* tk 1 ) 
Determine whether the sequence is convergent. 


Question 4 


Let $\{\mathbf{a}_k\}$ be the sequence with


a= ( k k ) 
* NIGER Vk 41) 7 


Determine whether the sequence is convergent. 


Question 5 


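Let $\{\mathbf{a}_k\}$ and $\{\mathbf{b}_k\}$ be sequences in $\mathbb{R}^n$ that converge to $\mathbf{a}$ and $\mathbf{b}$ respectively. Show that
\[ \lim_{k\to\infty} \langle \mathbf{a}_k, \mathbf{b}_k \rangle = \langle \mathbf{a}, \mathbf{b} \rangle. \]
Here $\langle \mathbf{x}, \mathbf{y} \rangle = \mathbf{x} \cdot \mathbf{y}$ is the standard inner product on $\mathbb{R}^n$.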




Question 6 
Suppose that $\{\mathbf{a}_k\}$ is a sequence in $\mathbb{R}^n$ that converges to $\mathbf{a}$, and $\{c_k\}$ is a sequence of real numbers that converges to $c$. Show that
\[ \lim_{k\to\infty} c_k \mathbf{a}_k = c\,\mathbf{a}. \]


Question 7 


Suppose that $\{\mathbf{a}_k\}$ is a sequence of nonzero vectors in $\mathbb{R}^n$ that converges to $\mathbf{a}$ and $\mathbf{a} \neq \mathbf{0}$, show that


Question 8 


Let $\{\mathbf{a}_k\}$ and $\{\mathbf{b}_k\}$ be sequences in $\mathbb{R}^n$. If $\{\mathbf{a}_k\}$ is convergent and $\{\mathbf{b}_k\}$ is divergent, show that the sequence $\{\mathbf{a}_k + \mathbf{b}_k\}$ is divergent.


Question 9 


Suppose that $\{\mathbf{a}_k\}$ is a sequence in $\mathbb{R}^n$ that converges to $\mathbf{a}$. If $r = \|\mathbf{a}\| \neq 0$, show that there is a positive integer $K$ such that
\[ \|\mathbf{a}_k\| > \frac{r}{2} \quad \text{for all } k \geq K. \]


Question 10 


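Let $\{\mathbf{a}_k\}$ be a sequence in $\mathbb{R}^n$ and let $\mathbf{b}$ be a point in $\mathbb{R}^n$. Assume that the sequence $\{\mathbf{a}_k\}$ does not converge to $\mathbf{b}$. Show that there is an $\varepsilon > 0$ and a subsequence $\{\mathbf{a}_{k_j}\}$ of $\{\mathbf{a}_k\}$ such that
\[ \|\mathbf{a}_{k_j} - \mathbf{b}\| \geq \varepsilon \quad \text{for all } j \in \mathbb{Z}^+. \]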




1.3. Open Sets and Closed Sets 


In volume I, we call an interval of the form $(a, b)$ an open interval. Given a point $x$ in $\mathbb{R}$, a neighbourhood of $x$ is an open interval $(a, b)$ that contains $x$. Given a subset $S$ of $\mathbb{R}$, we say that $x$ is an interior point of $S$ if there is a neighbourhood of $x$ that is contained in $S$. We say that $S$ is closed in $\mathbb{R}$ provided that if $\{a_k\}$ is a sequence of points in $S$ that converges to $a$, then $a$ is also in $S$. These describe the topology of $\mathbb{R}$. It is relatively simple.

For $n \geq 2$, the topological features of $\mathbb{R}^n$ are much more complicated.


An open interval $(a, b)$ in $\mathbb{R}$ can be described as a set of the form
\[ B = \{ x \in \mathbb{R} \mid |x - x_0| < r \}, \]
where $x_0 = \dfrac{a+b}{2}$ and $r = \dfrac{b-a}{2}$.


Figure 1.7: An open interval. 


Generalizing this, we define open balls in R”. 


Definition 1.22 Open Balls 


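Given $\mathbf{x}_0$ in $\mathbb{R}^n$ and $r > 0$, an open ball $B(\mathbf{x}_0, r)$ of radius $r$ with center at $\mathbf{x}_0$ is a subset of $\mathbb{R}^n$ of the form
\[ B(\mathbf{x}_0, r) = \{ \mathbf{x} \in \mathbb{R}^n \mid \|\mathbf{x} - \mathbf{x}_0\| < r \}. \]
It consists of all points of $\mathbb{R}^n$ whose distance to the center $\mathbf{x}_0$ is less than $r$.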


Obviously, if $0 < r_1 \leq r_2$, then $B(\mathbf{x}_0, r_1) \subset B(\mathbf{x}_0, r_2)$. The following is a useful lemma for balls with different centers.




Figure 1.8: An open ball. 


Lemma 1.23 


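Let $\mathbf{x}_1$ be a point in the open ball $B(\mathbf{x}_0, r)$. Then $\|\mathbf{x}_1 - \mathbf{x}_0\| < r$. If $r_1$ is a positive number satisfying
\[ r_1 \leq r - \|\mathbf{x}_1 - \mathbf{x}_0\|, \]
then the open ball $B(\mathbf{x}_1, r_1)$ is contained in the open ball $B(\mathbf{x}_0, r)$.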


Figure 1.9: An open ball containing another open ball with different center. 


Let $\mathbf{x}$ be a point in $B(\mathbf{x}_1, r_1)$. Then
\[ \|\mathbf{x} - \mathbf{x}_1\| < r_1 \leq r - \|\mathbf{x}_1 - \mathbf{x}_0\|. \]




By triangle inequality, 


\[ \|\mathbf{x} - \mathbf{x}_0\| \leq \|\mathbf{x} - \mathbf{x}_1\| + \|\mathbf{x}_1 - \mathbf{x}_0\| < r. \]
Therefore, $\mathbf{x}$ is a point in $B(\mathbf{x}_0, r)$. This proves the assertion.
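
Lemma 1.23 can be illustrated numerically by sampling points from the smaller ball and checking that they lie in the larger one. The Python sketch below does this in $\mathbb{R}^3$ with hypothetical data (`x0`, `r`, `x1` are made-up values, and `sample_in_ball` is an illustrative helper based on rejection sampling). It takes the largest radius allowed by the lemma, $r_1 = r - \|\mathbf{x}_1 - \mathbf{x}_0\|$.

```python
import math
import random

def sample_in_ball(center, radius, rng):
    # Rejection sampling: draw from the enclosing cube until the point
    # falls strictly inside the open ball B(center, radius).
    while True:
        p = [c + rng.uniform(-radius, radius) for c in center]
        if math.dist(p, center) < radius:
            return p

rng = random.Random(1)
x0, r = [0.0, 0.0, 0.0], 2.0          # the big ball B(x0, r)
x1 = [1.0, 0.5, -0.5]                 # a point inside B(x0, r)
r1 = r - math.dist(x1, x0)            # largest radius allowed by the lemma

points = [sample_in_ball(x1, r1, rng) for _ in range(2000)]
contained = all(math.dist(p, x0) < r for p in points)
print(r1 > 0, contained)
```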


Now we define open sets in R”. 


Definition 1.23 Open Sets 


Let $S$ be a subset of $\mathbb{R}^n$. We say that $S$ is an open set if for each $\mathbf{x} \in S$, there is a ball $B(\mathbf{x}, r)$ centered at $\mathbf{x}$ that is contained in $S$.


The following example justifies that an open interval of the form (a,b) is an 
open set. 


Example 1.16 


Let $S$ be the open interval $S = (a, b)$ in $\mathbb{R}$. If $x \in S$, then $a < x < b$. Hence, $x - a$ and $b - x$ are positive. Let $r = \min\{x - a, b - x\}$. Then $r > 0$, $r \leq x - a$ and $r \leq b - x$. These imply that $a \leq x - r < x + r \leq b$. Hence, $B(x, r) = (x - r, x + r) \subset (a, b) = S$. This shows that the interval $(a, b)$ is an open set.




Figure 1.10: The interval (a, b) is an open set. 


The following example justifies that an open ball is indeed an open set. 


Example 1.17 


Let $S = B(\mathbf{x}_0, r)$ be the open ball with center at $\mathbf{x}_0$ and radius $r > 0$ in $\mathbb{R}^n$. Show that $S$ is an open set.


Solution
Given $\mathbf{x} \in S$, $d = \|\mathbf{x} - \mathbf{x}_0\| < r$. Let $r_1 = r - d$. Then $r_1 > 0$. Lemma 1.23 implies that the ball $B(\mathbf{x}, r_1)$ is inside $S$. Hence, $S$ is an open set.


Example 1.18


As subsets of $\mathbb{R}^n$, $\emptyset$ and $\mathbb{R}^n$ are open sets.


Example 1.19 


A one-point set $S = \{\mathbf{a}\}$ in $\mathbb{R}^n$ cannot be open, for there is no $r > 0$ such that $B(\mathbf{a}, r)$ is contained in $S$.


Let us look at some other examples of open sets. 


Definition 1.24 Open Rectangles 


A set of the form
\[ U = \prod_{i=1}^{n} (a_i, b_i) = (a_1, b_1) \times \cdots \times (a_n, b_n) \]
in $\mathbb{R}^n$, which is a cartesian product of open bounded intervals, is called an open rectangle.


Figure 1.11: A rectangle in $\mathbb{R}^2$.




Example 1.20 


Let $U = \displaystyle\prod_{i=1}^{n} (a_i, b_i)$ be an open rectangle in $\mathbb{R}^n$. Show that $U$ is an open set.


Solution 


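Let $\mathbf{x} = (x_1, \ldots, x_n)$ be a point in $U$. Then for $1 \leq i \leq n$,
\[ r_i = \min\{x_i - a_i, b_i - x_i\} > 0 \]
and
\[ (x_i - r_i, x_i + r_i) \subset (a_i, b_i). \]
Let $r = \min\{r_1, \ldots, r_n\}$. Then $r > 0$. We claim that $B(\mathbf{x}, r)$ is contained in $U$.

If $\mathbf{y} \in B(\mathbf{x}, r)$, then $\|\mathbf{y} - \mathbf{x}\| < r$. This implies that
\[ |y_i - x_i| \leq \|\mathbf{y} - \mathbf{x}\| < r \leq r_i \quad \text{for all } 1 \leq i \leq n. \]
Hence,
\[ y_i \in (x_i - r_i, x_i + r_i) \subset (a_i, b_i) \quad \text{for all } 1 \leq i \leq n. \]
This proves that $\mathbf{y} \in U$, and thus completes the proof that $B(\mathbf{x}, r)$ is contained in $U$. Therefore, $U$ is an open set.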
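
The radius chosen in the solution of Example 1.20 is easy to compute. The Python sketch below implements $r = \min_i \min\{x_i - a_i, b_i - x_i\}$ and spot-checks, with random samples, that points of $B(\mathbf{x}, r)$ stay inside the rectangle. The function name `ball_radius_in_rectangle` and the sample data are assumptions made for illustration.

```python
import math
import random

def ball_radius_in_rectangle(x, a, b):
    # r_i = min(x_i - a_i, b_i - x_i) and r = min(r_1, ..., r_n),
    # as in the solution of Example 1.20.
    r_i = [min(xi - ai, bi - xi) for xi, ai, bi in zip(x, a, b)]
    r = min(r_i)
    if r <= 0:
        raise ValueError("x must lie in the open rectangle")
    return r

a, b = [0.0, 2.0], [3.0, 4.0]   # the rectangle (0, 3) x (2, 4)
x = [1.0, 2.5]
r = ball_radius_in_rectangle(x, a, b)

rng = random.Random(0)
inside = True
for _ in range(1000):
    y = [xi + rng.uniform(-r, r) for xi in x]
    if math.dist(x, y) < r:   # keep only samples inside the open ball
        inside = inside and all(ai < yi < bi for ai, yi, bi in zip(a, y, b))
print(r, inside)
```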


Figure 1.12: An open rectangle is an open set. 




Next, we define closed sets. The definition is a straightforward generalization 
of the n = 1 case. 


Definition 1.25 Closed Sets 


Let $S$ be a subset of $\mathbb{R}^n$. We say that $S$ is closed in $\mathbb{R}^n$ provided that if $\{\mathbf{a}_k\}$ is a sequence of points in $S$ that converges to the point $\mathbf{a}$, the point $\mathbf{a}$ is also in $S$.


Example 1.21 


As subsets of $\mathbb{R}^n$, $\emptyset$ and $\mathbb{R}^n$ are closed sets. Since $\emptyset$ and $\mathbb{R}^n$ are also open, a subset $S$ of $\mathbb{R}^n$ can be both open and closed.


Example 1.22 


Let $S = \{\mathbf{a}\}$ be a one-point set in $\mathbb{R}^n$. A sequence $\{\mathbf{a}_k\}$ in $S$ is just the constant sequence where $\mathbf{a}_k = \mathbf{a}$ for all $k \in \mathbb{Z}^+$. Hence, it converges to $\mathbf{a}$ which is in $S$. Thus, a one-point set $S$ is a closed set.


In volume I, we have proved the following. 
Proposition 1.24 


Let $I$ be an interval of the form $(-\infty, a]$, $[a, \infty)$ or $[a, b]$. Then $I$ is a closed subset of $\mathbb{R}$.


Definition 1.26 Closed Rectangles 


A set of the form
\[ R = \prod_{i=1}^{n} [a_i, b_i] = [a_1, b_1] \times \cdots \times [a_n, b_n] \]
in $\mathbb{R}^n$, which is a cartesian product of closed and bounded intervals, is called a closed rectangle.


The following justifies that a closed rectangle is indeed a closed set. 




Example 1.23 


Let
\[ R = \prod_{i=1}^{n} [a_i, b_i] = [a_1, b_1] \times \cdots \times [a_n, b_n] \]
be a closed rectangle in $\mathbb{R}^n$. Show that $R$ is a closed set.


Solution 
Let $\{\mathbf{a}_k\}$ be a sequence in $R$ that converges to a point $\mathbf{a}$. For each $1 \leq i \leq n$, $\{\pi_i(\mathbf{a}_k)\}$ is a sequence in $[a_i, b_i]$ that converges to $\pi_i(\mathbf{a})$. Since $[a_i, b_i]$ is a closed set in $\mathbb{R}$, $\pi_i(\mathbf{a}) \in [a_i, b_i]$. Hence, $\mathbf{a}$ is in $R$. This proves that $R$ is a closed set.


It is not true that a set that is not open is closed. 
Example 1.24 


Show that an interval of the form $I = (a, b]$ in $\mathbb{R}$ is neither open nor closed.


Solution 
If $I$ is open, since $b$ is in $I$, there is an $r > 0$ such that $(b - r, b + r) = B(b, r) \subset I$. But then $b + r/2$ is a point in $(b - r, b + r)$ but not in $I = (a, b]$, which gives a contradiction. Hence, $I$ is not open.

For $k \in \mathbb{Z}^+$, let
\[ a_k = a + \frac{b - a}{k}. \]
Then $\{a_k\}$ is a sequence in $I$ that converges to $a$, but $a$ is not in $I$. Hence, $I$ is not closed.
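
The second half of the argument needs some sequence in $(a, b]$ whose limit is the missing endpoint $a$. The Python sketch below uses $a_k = a + (b - a)/k$ as one possible choice. This particular formula, the endpoint values and the function name `term` are assumptions made for illustration; any sequence in the interval that converges to $a$ would serve the same purpose.

```python
def term(k, left, right):
    # One hypothetical sequence in (left, right] converging to left:
    # a_k = left + (right - left) / k.
    return left + (right - left) / k

left, right = 2.0, 5.0
terms = [term(k, left, right) for k in range(1, 10001)]

# Every term lies in the half-open interval (left, right] ...
in_interval = all(left < t <= right for t in terms)
# ... while the terms get arbitrarily close to the excluded endpoint.
gap = terms[-1] - left
print(in_interval, gap)
```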


Thus, we have seen that a subset S of R” can be both open and closed, and it 
can also be neither open nor closed. 
Let us look at some other examples of closed sets. 




Definition 1.27 Closed Balls 


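Given $\mathbf{x}_0$ in $\mathbb{R}^n$ and $r > 0$, a closed ball of radius $r$ with center at $\mathbf{x}_0$ is a subset of $\mathbb{R}^n$ of the form
\[ CB(\mathbf{x}_0, r) = \{ \mathbf{x} \in \mathbb{R}^n \mid \|\mathbf{x} - \mathbf{x}_0\| \leq r \}. \]
It consists of all points of $\mathbb{R}^n$ whose distance to the center $\mathbf{x}_0$ is less than or equal to $r$.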


The following justifies that a closed ball is indeed a closed set. 


Example 1.25 


Given $\mathbf{x}_0 \in \mathbb{R}^n$ and $r > 0$, show that the closed ball
\[ CB(\mathbf{x}_0, r) = \{ \mathbf{x} \in \mathbb{R}^n \mid \|\mathbf{x} - \mathbf{x}_0\| \leq r \} \]
is a closed set.


Solution 


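Let $\{\mathbf{a}_k\}$ be a sequence in $CB(\mathbf{x}_0, r)$ that converges to the point $\mathbf{a}$. Then
\[ \lim_{k\to\infty} \|\mathbf{a}_k - \mathbf{a}\| = 0. \]
For each $k \in \mathbb{Z}^+$, $\|\mathbf{a}_k - \mathbf{x}_0\| \leq r$. By triangle inequality,
\[ \|\mathbf{a} - \mathbf{x}_0\| \leq \|\mathbf{a}_k - \mathbf{x}_0\| + \|\mathbf{a}_k - \mathbf{a}\| \leq r + \|\mathbf{a}_k - \mathbf{a}\|. \]
Taking the $k \to \infty$ limit, we find that
\[ \|\mathbf{a} - \mathbf{x}_0\| \leq r. \]
Hence, $\mathbf{a}$ is in $CB(\mathbf{x}_0, r)$. This proves that $CB(\mathbf{x}_0, r)$ is a closed set.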


The following theorem gives the relation between open and closed sets. 




Theorem 1.25 


Let $S$ be a subset of $\mathbb{R}^n$ and let $A = \mathbb{R}^n \setminus S$ be its complement in $\mathbb{R}^n$. Then $S$ is open if and only if $A$ is closed.


Assume that $S$ is open. Let $\{\mathbf{a}_k\}$ be a sequence in $A$ that converges to the point $\mathbf{a}$. We want to show that $\mathbf{a}$ is in $A$. Assume to the contrary that $\mathbf{a}$ is not in $A$. Then $\mathbf{a}$ is in $S$. Since $S$ is open, there is an $r > 0$ such that $B(\mathbf{a}, r)$ is contained in $S$. Since the sequence $\{\mathbf{a}_k\}$ converges to $\mathbf{a}$, there is a positive integer $K$ such that for all $k \geq K$, $\|\mathbf{a}_k - \mathbf{a}\| < r$. But then this implies that $\mathbf{a}_K \in B(\mathbf{a}, r) \subset S$. This contradicts the fact that $\mathbf{a}_K$ is in $A = \mathbb{R}^n \setminus S$. Hence, $\mathbf{a}$ must be in $A$, which proves that $A$ is closed.

Conversely, assume that $A$ is closed. We want to show that $S$ is open. Assume to the contrary that $S$ is not open. Then there is a point $\mathbf{a}$ in $S$ such that for every $r > 0$, $B(\mathbf{a}, r)$ is not contained in $S$. For every $k \in \mathbb{Z}^+$, since $B(\mathbf{a}, 1/k)$ is not contained in $S$, there is a point $\mathbf{a}_k$ in $B(\mathbf{a}, 1/k)$ such that $\mathbf{a}_k$ is not in $S$. Thus, $\{\mathbf{a}_k\}$ is a sequence in $A$ and
\[ \|\mathbf{a}_k - \mathbf{a}\| < \frac{1}{k}. \]
This shows that $\{\mathbf{a}_k\}$ converges to $\mathbf{a}$. Since $A$ is closed, $\mathbf{a}$ is in $A$, which contradicts the fact that $\mathbf{a}$ is in $S$. Thus, $S$ must be open.




Figure 1.13: A sequence outside an open set cannot converge to a point in the 
open set. 




Next, we consider unions and intersections of sets. 
Theorem 1.26 


1. Arbitrary union of open sets is open. Namely, if $\{U_\alpha \mid \alpha \in J\}$ is a collection of open sets in $\mathbb{R}^n$, then their union $U = \displaystyle\bigcup_{\alpha \in J} U_\alpha$ is also an open set.

2. Finite intersection of open sets is open. Namely, if $V_1, \ldots, V_k$ are open sets in $\mathbb{R}^n$, then their intersection $V = \displaystyle\bigcap_{i=1}^{k} V_i$ is also an open set.


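To prove the first statement, let $\mathbf{x}$ be a point in $U = \displaystyle\bigcup_{\alpha \in J} U_\alpha$. Then there is an $\alpha \in J$ such that $\mathbf{x}$ is in $U_\alpha$. Since $U_\alpha$ is open, there is an $r > 0$ such that $B(\mathbf{x}, r) \subset U_\alpha \subset U$. Hence, $U$ is open.

For the second statement, let $\mathbf{x}$ be a point in $V = \displaystyle\bigcap_{i=1}^{k} V_i$. Then for each $1 \leq i \leq k$, $\mathbf{x}$ is in the open set $V_i$. Hence, there is an $r_i > 0$ such that $B(\mathbf{x}, r_i) \subset V_i$. Let $r = \min\{r_1, \ldots, r_k\}$. Then for $1 \leq i \leq k$, $r \leq r_i$ and so $B(\mathbf{x}, r) \subset B(\mathbf{x}, r_i) \subset V_i$. Hence, $B(\mathbf{x}, r) \subset V$. This proves that $V$ is open.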


As an application of this theorem, let us show that any open interval in R is 
indeed an open set. 


Proposition 1.27 


Let $I$ be an interval of the form $(-\infty, a)$, $(a, \infty)$ or $(a, b)$. Then $I$ is an open subset of $\mathbb{R}$.




We have shown in Example 1.16 that if $I$ is an interval of the form $(a, b)$, then $I$ is an open subset of $\mathbb{R}$. Now
\[ (a, \infty) = \bigcup_{k=1}^{\infty} (a, a + k) \]
is a union of open sets. Hence, $(a, \infty)$ is open. In the same way, one can show that an interval of the form $(-\infty, a)$ is open.


The next example shows that an arbitrary intersection of open sets is not necessarily open.


Example 1.26 


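For $k \in \mathbb{Z}^+$, let $U_k$ be the open set in $\mathbb{R}$ given by
\[ U_k = \left( -\frac{1}{k}, \frac{1}{k} \right). \]
Notice that the set
\[ U = \bigcap_{k=1}^{\infty} U_k = \{0\} \]
is a one-point set. Hence, it is not open in $\mathbb{R}$.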
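
The reason the intersection collapses to a single point can be made concrete: every nonzero $x$ is eventually excluded from $U_k$. The Python sketch below assumes the intervals are $U_k = (-1/k, 1/k)$ and uses exact rational arithmetic to find the first index $k$ with $x \notin U_k$. The function names `in_U` and `excluding_index` are hypothetical.

```python
import math
from fractions import Fraction

def in_U(x, k):
    # Membership test for the assumed interval U_k = (-1/k, 1/k).
    return -Fraction(1, k) < x < Fraction(1, k)

def excluding_index(x):
    # Smallest positive integer k with x not in U_k, for a nonzero rational x.
    # x is outside U_k exactly when |x| >= 1/k, that is, when k >= 1/|x|.
    if x == 0:
        raise ValueError("0 belongs to every U_k")
    return max(1, math.ceil(1 / abs(x)))

x = Fraction(3, 10)
k = excluding_index(x)
print(k, in_U(x, k - 1), in_U(x, k))
```

For $x = 3/10$ the sketch reports $k = 4$, since $3/10 < 1/3$ but $3/10 \geq 1/4$.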


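De Morgan's law in set theory says that if $\{U_\alpha \mid \alpha \in J\}$ is a collection of sets in $\mathbb{R}^n$, then
\[ \mathbb{R}^n \setminus \bigcup_{\alpha \in J} U_\alpha = \bigcap_{\alpha \in J} (\mathbb{R}^n \setminus U_\alpha), \]
\[ \mathbb{R}^n \setminus \bigcap_{\alpha \in J} U_\alpha = \bigcup_{\alpha \in J} (\mathbb{R}^n \setminus U_\alpha). \]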


Thus, we obtain the counterpart of Theorem 1.26 for closed sets. 




Theorem 1.28 


1. Arbitrary intersection of closed sets is closed. Namely, if $\{A_\alpha \mid \alpha \in J\}$ is a collection of closed sets in $\mathbb{R}^n$, then their intersection $A = \displaystyle\bigcap_{\alpha \in J} A_\alpha$ is also a closed set.

2. Finite union of closed sets is closed. Namely, if $C_1, \ldots, C_k$ are closed sets in $\mathbb{R}^n$, then their union $C = \displaystyle\bigcup_{i=1}^{k} C_i$ is also a closed set.


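We prove the first statement. The proof of the second statement is similar. Given that $\{A_\alpha \mid \alpha \in J\}$ is a collection of closed sets in $\mathbb{R}^n$, for each $\alpha \in J$, let $U_\alpha = \mathbb{R}^n \setminus A_\alpha$. Then $\{U_\alpha \mid \alpha \in J\}$ is a collection of open sets in $\mathbb{R}^n$. By Theorem 1.26, the set $\displaystyle\bigcup_{\alpha \in J} U_\alpha$ is open. By Theorem 1.25, $\mathbb{R}^n \setminus \displaystyle\bigcup_{\alpha \in J} U_\alpha$ is closed. By De Morgan's law,
\[ \mathbb{R}^n \setminus \bigcup_{\alpha \in J} U_\alpha = \bigcap_{\alpha \in J} (\mathbb{R}^n \setminus U_\alpha) = \bigcap_{\alpha \in J} A_\alpha. \]
This proves that $\displaystyle\bigcap_{\alpha \in J} A_\alpha$ is a closed set.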


The following example says that any finite point set is a closed set. 


Example 1.27 


Let $S = \{\mathbf{x}_1, \ldots, \mathbf{x}_k\}$ be a finite point set in $\mathbb{R}^n$. Then $S = \displaystyle\bigcup_{i=1}^{k} \{\mathbf{x}_i\}$ is a finite union of one-point sets. Since a one-point set is closed, $S$ is closed.




Exercises 1.3 


Question 1 


Let A be the subset of R? given by 

A la) oe Ong = Ot 
Show that A is an open set. 
Question 2 
Let A be the subset of R? given by 

\[ A = \{ (x, y) \mid x \geq 0, y \geq 0 \}. \]
Show that A is a closed set. 


Question 3 


Let A be the subset of R? given by 


A> 1a) |e 0.y 20: 


Is A open? Is A closed? Justify your answers. 


Question 4 


Let $C$ and $U$ be subsets of $\mathbb{R}^n$. Assume that $C$ is closed and $U$ is open. Show that $U \setminus C$ is open and $C \setminus U$ is closed.


Question 5 


Let $A$ be a subset of $\mathbb{R}^n$, and let $B = A + \mathbf{u}$ be the translate of $A$ by the vector $\mathbf{u}$.
(a) Show that A is open if and only if B is open. 


(b) Show that A is closed if and only if B is closed. 




1.4 Interior, Exterior, Boundary and Closure 


First, we introduce the interior of a set. 


Definition 1.28 Interior 


Let $S$ be a subset of $\mathbb{R}^n$. We say that $\mathbf{x} \in \mathbb{R}^n$ is an interior point of $S$ if there exists $r > 0$ such that $B(\mathbf{x}, r) \subset S$. The interior of $S$, denoted by int $S$, is defined to be the collection of all the interior points of $S$.




Figure 1.14: The interior point of a set. 


The following gives a characterization of the interior of a set. 


Theorem 1.29 


Let $S$ be a subset of $\mathbb{R}^n$. Then we have the following.

1. int $S$ is a subset of $S$.

2. int $S$ is an open set.

3. $S$ is an open set if and only if $S = \text{int}\,S$.

4. If $U$ is an open set that is contained in $S$, then $U \subset \text{int}\,S$.

These imply that int $S$ is the largest open set that is contained in $S$.




Let $\mathbf{x}$ be a point in int $S$. By definition, there exists $r > 0$ such that $B(\mathbf{x}, r) \subset S$. Since $\mathbf{x} \in B(\mathbf{x}, r)$ and $B(\mathbf{x}, r) \subset S$, $\mathbf{x}$ is a point in $S$. Since we have shown that every point in int $S$ is in $S$, int $S$ is a subset of $S$.

If $\mathbf{y} \in B(\mathbf{x}, r)$, Lemma 1.23 says that there is an $r' > 0$ such that $B(\mathbf{y}, r') \subset B(\mathbf{x}, r) \subset S$. Hence, $\mathbf{y}$ is also in int $S$. This proves that $B(\mathbf{x}, r)$ is contained in int $S$. Since we have shown that for any $\mathbf{x} \in \text{int}\,S$, there is an $r > 0$ such that $B(\mathbf{x}, r)$ is contained in int $S$, this shows that int $S$ is open.

If $S = \text{int}\,S$, $S$ is open. Conversely, if $S$ is open, for every $\mathbf{x}$ in $S$, there is an $r > 0$ such that $B(\mathbf{x}, r) \subset S$. Then $\mathbf{x}$ is in int $S$. Hence, $S \subset \text{int}\,S$. Since we have shown that $\text{int}\,S \subset S$ is always true, we conclude that if $S$ is open, $S = \text{int}\,S$.

If $U$ is a subset of $S$ and $U$ is open, for every $\mathbf{x}$ in $U$, there is an $r > 0$ such that $B(\mathbf{x}, r) \subset U$. But then $B(\mathbf{x}, r) \subset S$. This shows that $\mathbf{x}$ is in int $S$. Since every point of $U$ is in int $S$, this proves that $U \subset \text{int}\,S$.


Example 1.28 


Find the interior of each of the following subsets of R. 


(a) $A = (a, b)$ (b) $B = (a, b]$

(c) $C = [a, b]$ (d) $\mathbb{Q}$


Solution 


(a) Since $A$ is an open set, $\text{int}\,A = A = (a, b)$.

(b) Since $A$ is an open set that is contained in $B$, $A = (a, b)$ is contained in int $B$. Since $\text{int}\,B \subset B$, we are only left to determine whether $b$ is in int $B$. The same argument as given in Example 1.24 shows that $b$ is not an interior point of $B$. Hence, $\text{int}\,B = A = (a, b)$.




(c) Similar arguments as given in (b) show that $A \subset \text{int}\,C$, and both $a$ and $b$ are not interior points of $C$. Hence, $\text{int}\,C = A = (a, b)$.

(d) For any $x \in \mathbb{R}$ and any $r > 0$, $B(x, r) = (x - r, x + r)$ contains an irrational number. Hence, $B(x, r)$ is not contained in $\mathbb{Q}$. This shows that $\mathbb{Q}$ does not have interior points. Hence, $\text{int}\,\mathbb{Q} = \emptyset$.


Definition 1.29 Neighbourhoods 


Let $\mathbf{x}$ be a point in $\mathbb{R}^n$ and let $U$ be a subset of $\mathbb{R}^n$. We say that $U$ is a neighbourhood of $\mathbf{x}$ if $U$ is an open set that contains $\mathbf{x}$.


Notice that this definition is slightly different from the one we use in volume I 
for the n = 1 case. 


Neighbourhoods 
By definition, if $U$ is a neighbourhood of $\mathbf{x}$, then $\mathbf{x}$ is an interior point of $U$, and there is an $r > 0$ such that $B(\mathbf{x}, r) \subset U$.
Example 1.29 
Consider the point x = (1, 2) and the sets 


v= { (a1, %2) | ej + 25 0, 


V = 1G) Op = 


R?. The sets U and V are neighbourhoods of x. 


Next, we introduce the exterior and boundary of a set. 


Definition 1.30 Exterior 


Let $S$ be a subset of $\mathbb{R}^n$. We say that $\mathbf{x} \in \mathbb{R}^n$ is an exterior point of $S$ if there exists $r > 0$ such that $B(\mathbf{x}, r) \subset \mathbb{R}^n \setminus S$. The exterior of $S$, denoted by ext $S$, is defined to be the collection of all the exterior points of $S$.




Figure 1.15: The sets U and V are neighbourhoods of the point x. 


Definition 1.31 Boundary 


Let $S$ be a subset of $\mathbb{R}^n$. We say that $\mathbf{x} \in \mathbb{R}^n$ is a boundary point of $S$ if for every $r > 0$, the ball $B(\mathbf{x}, r)$ intersects both $S$ and $\mathbb{R}^n \setminus S$. The boundary of $S$, denoted by bd $S$ or $\partial S$, is defined to be the collection of all the boundary points of $S$.


Figure 1.16: $P$ is an interior point, $Q$ is an exterior point, $R$ is a boundary point.




Theorem 1.30 
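Let $S$ be a subset of $\mathbb{R}^n$. We have the following.

(a) $\text{ext}\,(S) = \text{int}\,(\mathbb{R}^n \setminus S)$.

(b) $\text{bd}\,(S) = \text{bd}\,(\mathbb{R}^n \setminus S)$.

(c) int $S$, ext $S$ and bd $S$ are mutually disjoint sets.

(d) $\mathbb{R}^n = \text{int}\,S \cup \text{ext}\,S \cup \text{bd}\,S$.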


(a) and (b) are obvious from definitions. 


For parts (c) and (d), we notice that for a point $\mathbf{x} \in \mathbb{R}^n$, exactly one of the following three statements holds.

(i) There exists $r > 0$ such that $B(\mathbf{x}, r) \subset S$.

(ii) There exists $r > 0$ such that $B(\mathbf{x}, r) \subset \mathbb{R}^n \setminus S$.

(iii) For every $r > 0$, $B(\mathbf{x}, r)$ intersects both $S$ and $\mathbb{R}^n \setminus S$.

Thus, int $S$, ext $S$ and bd $S$ are mutually disjoint sets, and their union is $\mathbb{R}^n$.


Example 1.30 


Find the exterior and boundary of each of the following subsets of R. 


(a) $A = (a, b)$ (b) $B = (a, b]$

(c) $C = [a, b]$ (d) $\mathbb{Q}$


Solution 
We have seen in Example 1.28 that 


\[ \text{int}\,A = \text{int}\,B = \text{int}\,C = (a, b). \]




For any $r > 0$, the ball $B(a, r) = (a - r, a + r)$ contains a point less than $a$, and a point larger than $a$. Hence, $a$ is a boundary point of the sets $A$, $B$ and $C$. Similarly, $b$ is a boundary point of the sets $A$, $B$ and $C$.

For every point $x$ which satisfies $x < a$, let $r = a - x$. Then $r > 0$. Since $x + r = a$, the ball $B(x, r) = (x - r, x + r)$ is contained in $(-\infty, a)$. Hence, $x$ is an exterior point of the sets $A$, $B$ and $C$. Similarly, every point $x$ such that $x > b$ is an exterior point of the sets $A$, $B$ and $C$.

Since the interior, exterior and boundary of a set in $\mathbb{R}$ are three mutually disjoint sets whose union is $\mathbb{R}$, we conclude that
\[ \text{bd}\,A = \text{bd}\,B = \text{bd}\,C = \{a, b\}, \]
\[ \text{ext}\,A = \text{ext}\,B = \text{ext}\,C = (-\infty, a) \cup (b, \infty). \]

For every $x \in \mathbb{R}$ and every $r > 0$, the ball $B(x, r) = (x - r, x + r)$ contains a point in $\mathbb{Q}$ and a point not in $\mathbb{Q}$. Therefore, $x$ is a boundary point of $\mathbb{Q}$. This shows that $\text{bd}\,\mathbb{Q} = \mathbb{R}$, and thus, $\text{ext}\,\mathbb{Q} = \emptyset$.


Example 1.31 


Let $A = B(\mathbf{x}_0, r)$, where $\mathbf{x}_0$ is a point in $\mathbb{R}^n$, and $r$ is a positive number. Find the interior, exterior and boundary of $A$.


Solution 
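We have shown that $A$ is open. Hence, $\text{int}\,A = A$. Let
\[ U = \{ \mathbf{x} \in \mathbb{R}^n \mid \|\mathbf{x} - \mathbf{x}_0\| > r \}, \qquad C = \{ \mathbf{x} \in \mathbb{R}^n \mid \|\mathbf{x} - \mathbf{x}_0\| = r \}. \]
Notice that $A$, $U$ and $C$ are mutually disjoint sets whose union is $\mathbb{R}^n$.

If $\mathbf{x}$ is in $U$, $d = \|\mathbf{x} - \mathbf{x}_0\| > r$. Let $r' = d - r$. Then $r' > 0$. If $\mathbf{y} \in B(\mathbf{x}, r')$, then $\|\mathbf{y} - \mathbf{x}\| < r'$. It follows that
\[ \|\mathbf{y} - \mathbf{x}_0\| \geq \|\mathbf{x} - \mathbf{x}_0\| - \|\mathbf{y} - \mathbf{x}\| > d - r' = r. \]
This proves that $\mathbf{y} \in U$. Hence, $B(\mathbf{x}, r') \subset U \subset \mathbb{R}^n \setminus A$, which shows that $\mathbf{x}$ is an exterior point of $A$. Thus, $U \subset \text{ext}\,A$.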




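Now if $\mathbf{x} \in C$, $\|\mathbf{x} - \mathbf{x}_0\| = r$. For every $r' > 0$, let $a = \dfrac{1}{2} \min\{r'/r, 1\}$. Then $a \leq \dfrac{1}{2}$ and $a \leq \dfrac{r'}{2r}$. Consider the point
\[ \mathbf{v} = \mathbf{x} - a(\mathbf{x} - \mathbf{x}_0). \]
Notice that
\[ \|\mathbf{v} - \mathbf{x}\| = ar \leq \frac{r'}{2} < r'. \]
Thus, $\mathbf{v}$ is in $B(\mathbf{x}, r')$. On the other hand,
\[ \|\mathbf{v} - \mathbf{x}_0\| = (1 - a) r < r. \]
Thus, $\mathbf{v}$ is in $A$. This shows that $B(\mathbf{x}, r')$ intersects $A$. Since $\mathbf{x}$ is in $B(\mathbf{x}, r')$ but not in $A$, we find that $B(\mathbf{x}, r')$ intersects $\mathbb{R}^n \setminus A$. Hence, $\mathbf{x}$ is a boundary point of $A$. This shows that $C \subset \text{bd}\,A$.

Since int $A$, ext $A$ and bd $A$ are mutually disjoint sets, we conclude that $\text{int}\,A = A$, $\text{ext}\,A = U$ and $\text{bd}\,A = C$.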
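
The conclusion of Example 1.31 says that a point is classified by comparing its distance to the center with $r$. The Python sketch below encodes this, together with the auxiliary point $\mathbf{v} = \mathbf{x} - a(\mathbf{x} - \mathbf{x}_0)$ used in the boundary argument. The function names `classify` and `witness_inside` are hypothetical, and the exact comparison `d == r` is only reliable for inputs whose distance is computed exactly in floating point.

```python
import math

def classify(x, x0, r):
    # Position of x relative to the open ball A = B(x0, r).
    d = math.dist(x, x0)
    if d < r:
        return "interior"
    if d > r:
        return "exterior"
    return "boundary"   # d == r; fragile under floating-point rounding

def witness_inside(x, x0, r, r_prime):
    # For x on the sphere ||x - x0|| = r, return v = x - a (x - x0) with
    # a = (1/2) min(r'/r, 1); v lies in both B(x, r') and B(x0, r).
    a = 0.5 * min(r_prime / r, 1.0)
    return tuple(xi - a * (xi - ci) for xi, ci in zip(x, x0))

x0, r = (0.0, 0.0), 5.0
x = (3.0, 4.0)                      # a point at distance exactly 5 from x0
v = witness_inside(x, x0, r, 0.1)
print(classify(x, x0, r), classify(v, x0, r), math.dist(v, x))
```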


Now we introduce the closure of a set. 


Definition 1.32 Closure 


Let $S$ be a subset of $\mathbb{R}^n$. The closure of $S$, denoted by $\overline{S}$, is defined as
\[ \overline{S} = \text{int}\,S \cup \text{bd}\,S. \]


Example 1.32 


Example 1.31 shows that the closure of the open ball $B(\mathbf{x}_0, r)$ is the closed ball $CB(\mathbf{x}_0, r)$.


Example 1.33 


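Consider the sets $A = (a, b)$, $B = (a, b]$ and $C = [a, b]$ in Example 1.28 and Example 1.30. We have shown that $\text{int}\,A = \text{int}\,B = \text{int}\,C = (a, b)$, and $\text{bd}\,A = \text{bd}\,B = \text{bd}\,C = \{a, b\}$. Therefore, $\overline{A} = \overline{B} = \overline{C} = [a, b]$.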




Since R” is a disjoint union of int S, bd S' and ext S, we obtain the following 


immediately from the definition. 


Theorem 1.31 


Let $S$ be a subset of $\mathbb{R}^n$. Then $\overline{S}$ and ext $S$ are complements of each other in $\mathbb{R}^n$.


The following theorem gives a characterization of the closure of a set. 


Theorem 1.32 


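Let $S$ be a subset of $\mathbb{R}^n$, and let $\mathbf{x}$ be a point in $\mathbb{R}^n$. The following statements are equivalent.

(a) $\mathbf{x} \in \overline{S}$.

(b) For every $r > 0$, $B(\mathbf{x}, r)$ intersects $S$.

(c) There is a sequence $\{\mathbf{x}_k\}$ in $S$ that converges to $\mathbf{x}$.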


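If $\mathbf{x}$ is in $\overline{S}$, then $\mathbf{x}$ is not in $\text{ext}\,S = \text{int}\,(\mathbb{R}^n \setminus S)$. Thus, for every $r > 0$, $B(\mathbf{x}, r)$ is not contained in $\mathbb{R}^n \setminus S$. Then it must intersect $S$. This proves (a) implies (b).

If (b) holds, for every $k \in \mathbb{Z}^+$, take $r = 1/k$. The ball $B(\mathbf{x}, 1/k)$ intersects $S$ at some point $\mathbf{x}_k$. This gives a sequence $\{\mathbf{x}_k\}$ satisfying
\[ \|\mathbf{x}_k - \mathbf{x}\| < \frac{1}{k}. \]
Thus, $\{\mathbf{x}_k\}$ is a sequence in $S$ that converges to $\mathbf{x}$. This proves (b) implies (c).

If (c) holds, for every $r > 0$, there is a positive integer $K$ such that for all $k \geq K$, $\|\mathbf{x}_k - \mathbf{x}\| < r$, and thus $\mathbf{x}_k \in B(\mathbf{x}, r)$. This shows that $B(\mathbf{x}, r)$ is not contained in $\mathbb{R}^n \setminus S$. Hence, $\mathbf{x} \notin \text{ext}\,S$, and thus we must have $\mathbf{x} \in \overline{S}$. This proves (c) implies (a).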


The following theorem gives further properties of the closure of a set. 




Theorem 1.33 
Let $S$ be a subset of $\mathbb{R}^n$.

1. $\overline{S}$ is a closed set that contains $S$.

2. $S$ is closed if and only if $S = \overline{S}$.

3. If $C$ is a closed subset of $\mathbb{R}^n$ and $S \subset C$, then $\overline{S} \subset C$.

These imply that $\overline{S}$ is the smallest closed set that contains $S$.


These statements are counterparts of the statements in Theorem 1.29. 


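Since $\text{ext}\,S = \text{int}\,(\mathbb{R}^n \setminus S)$, and the interior of a set is open, ext $S$ is open. Since $\overline{S} = \mathbb{R}^n \setminus \text{ext}\,S$, $\overline{S}$ is a closed set. Since $\text{ext}\,S \subset \mathbb{R}^n \setminus S$, we find that
\[ S \subset \mathbb{R}^n \setminus \text{ext}\,S = \overline{S}. \]

If $S = \overline{S}$, then $S$ must be closed since $\overline{S}$ is closed. Conversely, if $S$ is closed, $\mathbb{R}^n \setminus S$ is open, and so $\text{ext}\,S = \text{int}\,(\mathbb{R}^n \setminus S) = \mathbb{R}^n \setminus S$. It follows that $\overline{S} = \mathbb{R}^n \setminus \text{ext}\,S = S$.

If $C$ is a closed set that contains $S$, then $\mathbb{R}^n \setminus C$ is an open set that is contained in $\mathbb{R}^n \setminus S$. Thus, $\mathbb{R}^n \setminus C \subset \text{int}\,(\mathbb{R}^n \setminus S) = \text{ext}\,S$. This shows that $C \supset \mathbb{R}^n \setminus \text{ext}\,S = \overline{S}$.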


Corollary 1.34 


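If $S$ is a subset of $\mathbb{R}^n$, then $\overline{S} = S \cup \text{bd}\,S$.

Since $\text{int}\,S \subset S$, $\overline{S} = \text{int}\,S \cup \text{bd}\,S \subset S \cup \text{bd}\,S$. Since $S$ and bd $S$ are both subsets of $\overline{S}$, $S \cup \text{bd}\,S \subset \overline{S}$. This proves that $\overline{S} = S \cup \text{bd}\,S$.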




Example 1.34 


Let $U$ be the open rectangle $U = \displaystyle\prod_{i=1}^{n} (a_i, b_i)$ in $\mathbb{R}^n$. Show that the closure of $U$ is the closed rectangle $R = \displaystyle\prod_{i=1}^{n} [a_i, b_i]$.


Solution 


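Since $R$ is a closed set that contains $U$, $\overline{U} \subset R$.

If $\mathbf{x} = (x_1, \ldots, x_n)$ is a point in $R$, then $x_i \in [a_i, b_i]$ for each $1 \leq i \leq n$. Since $[a_i, b_i]$ is the closure of $(a_i, b_i)$ in $\mathbb{R}$, there is a sequence $\{x_{i,k}\}_{k=1}^{\infty}$ in $(a_i, b_i)$ that converges to $x_i$. For $k \in \mathbb{Z}^+$, let
\[ \mathbf{x}_k = (x_{1,k}, \ldots, x_{n,k}). \]
Then $\{\mathbf{x}_k\}$ is a sequence in $U$ that converges to $\mathbf{x}$. This shows that $\mathbf{x} \in \overline{U}$, and thus completes the proof that $\overline{U} = R$.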


The proof of the following theorem shows the usefulness of the characterization of int $S$ as the largest open set that is contained in $S$, and of $\overline{S}$ as the smallest closed set that contains $S$.

Theorem 1.35 

If $A$ and $B$ are subsets of $\mathbb{R}^n$ such that $A \subset B$, then

(a) $\text{int}\,A \subset \text{int}\,B$; and

(b) $\overline{A} \subset \overline{B}$.


Since int $A$ is an open set that is contained in $A$, it is an open set that is contained in $B$. By the fourth statement in Theorem 1.29, $\text{int}\,A \subset \text{int}\,B$.

Since $\overline{B}$ is a closed set that contains $B$, it is a closed set that contains $A$. By the third statement in Theorem 1.33, $\overline{A} \subset \overline{B}$.


Notice that as subsets of $\mathbb{R}$, $(a, b) \subset (a, b] \subset [a, b]$. We have shown that $\overline{(a, b)} = \overline{(a, b]} = [a, b]$. In general, we have the following.


Theorem 1.36 


If $A$ and $B$ are subsets of $\mathbb{R}^n$ such that $A \subset B \subset \overline{A}$, then $\overline{A} = \overline{B}$.


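By Theorem 1.35, $A \subset B$ implies that $\overline{A} \subset \overline{B}$, while $B \subset \overline{A}$ implies that $\overline{B}$ is contained in $\overline{\overline{A}} = \overline{A}$. Thus, we have
\[ \overline{A} \subset \overline{B} \subset \overline{A}, \]
which proves that $\overline{B} = \overline{A}$.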


Example 1.35 


In general, if $S$ is a subset of $\mathbb{R}^n$, it is not necessarily true that $\text{int}\,\overline{S} = \text{int}\,S$, even when $S$ is an open set. For example, take $S = (-1, 0) \cup (0, 1)$ in $\mathbb{R}$. Then $S$ is an open set and $\overline{S} = [-1, 1]$. Notice that $\text{int}\,S = S = (-1, 0) \cup (0, 1)$, but $\text{int}\,\overline{S} = (-1, 1)$.




Exercises 1.4 


Question 1 


Let S be a subset of IR”. Show that bd S is a closed set. 


Question 2 


Let A be the subset of R? given by 
\[ A = \{ (x, y) \mid x < 0, y \geq 0 \}. \]


Find the interior, exterior, boundary and closure of A. 


Question 3 


Let Xp be a point in R”, and let r be a positive number. Consider the subset 
of IR” given by 


A= {x € R”|0 < ||x — xo|| <r}. 
Find the interior, exterior, boundary and closure of A. 
Question 4 
Let A be the subset of R? given by 
A={(z,y)|1< 2 <3,-2<y < 5}U {(0,0), (2, —3)}. 


Find the interior, exterior, boundary and closure of A. 


Question 5 


Let S be a subset of IR”. Show that 


\[ \text{bd}\,S = \overline{S} \cap \overline{\mathbb{R}^n \setminus S}. \]




Question 6 


Let $S$ be a subset of $\mathbb{R}^n$. Show that $\text{bd}\,\overline{S} \subset \text{bd}\,S$. Give an example where $\text{bd}\,\overline{S} \neq \text{bd}\,S$.


Question 7 


Let S be a subset of R”. 


(a) Show that $S$ is open if and only if $S$ does not contain any of its boundary points.

(b) Show that $S$ is closed if and only if $S$ contains all its boundary points.


Question 8 


Let S be a subset of IR”, and let x be a point in R”. 


(a) Show that x is an interior point of S if and only if there is a 
neighbourhood of x that is contained in S. 


(b) Show that $\mathbf{x} \in \overline{S}$ if and only if every neighbourhood of $\mathbf{x}$ intersects $S$.


(c) Show that x is a boundary point of S' if and only if every neighbourhood 
of x contains a point in S and a point not in S. 


Question 9 


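Let $S$ be a subset of $\mathbb{R}^n$, and let $\mathbf{x} = (x_1, \ldots, x_n)$ be a point in the interior of $S$.

(a) Show that there is an $r_1 > 0$ such that $CB(\mathbf{x}, r_1) \subset S$.

(b) Show that there is an $r_2 > 0$ such that $\displaystyle\prod_{i=1}^{n} (x_i - r_2, x_i + r_2) \subset S$.

(c) Show that there is an $r_3 > 0$ such that $\displaystyle\prod_{i=1}^{n} [x_i - r_3, x_i + r_3] \subset S$.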




1.5 Limit Points and Isolated Points 


In this section, we generalize the concepts of limit points and isolated points to 
subsets of R”. 


Definition 1.33 Limit Points 


Let S be a subset of IR”. A point x in R” is a limit point of S provided that 


there is a sequence {x;,} in S \ {x} that converges to x. The set of limit 
points of S' is denoted by S’. 


By Theorem 1.32, we obtain the following immediately. 


Theorem 1.37 


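Let $S$ be a subset of $\mathbb{R}^n$, and let $\mathbf{x}$ be a point in $\mathbb{R}^n$. The following are equivalent.

(a) $\mathbf{x}$ is a limit point of $S$.

(b) $\mathbf{x}$ is in $\overline{S \setminus \{\mathbf{x}\}}$.

(c) For every $r > 0$, $B(\mathbf{x}, r)$ intersects $S$ at a point other than $\mathbf{x}$.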


Corollary 1.38 


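If $S$ is a subset of $\mathbb{R}^n$, then $S' \subset \overline{S}$.


If $\mathbf{x} \in S'$, then $\mathbf{x} \in \overline{S \setminus \{\mathbf{x}\}}$. Since $S \setminus \{\mathbf{x}\} \subset S$, we have $\overline{S \setminus \{\mathbf{x}\}} \subset \overline{S}$. Therefore, $\mathbf{x} \in \overline{S}$.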


The following theorem says that the closure of a set is the union of the set with 
all its limit points. 


Theorem 1.39 


If $S$ is a subset of $\mathbb{R}^n$, then $\overline{S} = S \cup S'$.


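By Corollary 1.38, $S' \subset \overline{S}$. Since we also have $S \subset \overline{S}$, we find that
$S \cup S' \subset \overline{S}$.

Conversely, if $\mathbf{x} \in \overline{S}$, then by Theorem 1.32, there is a sequence $\{\mathbf{x}_k\}$ in $S$
that converges to $\mathbf{x}$. If $\mathbf{x}$ is not in $S$, then the sequence $\{\mathbf{x}_k\}$ is in $S \setminus \{\mathbf{x}\}$.
In this case, $\mathbf{x}$ is a limit point of $S$. This shows that $\overline{S} \setminus S \subset S'$, and hence,
$\overline{S} \subset S \cup S'$.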


In the proof above, we have shown the following. 


Corollary 1.40 


Let $S$ be a subset of $\mathbb{R}^n$. Every point in $\overline{S}$ that is not in $S$ is a limit point of
$S$. Namely,
$$\overline{S} \setminus S \subset S'.$$


Now we introduce the definition of isolated points. 


Definition 1.34 Isolated Points 


Let S be a subset of R”. A point x in R” is an isolated point of S if 
(a) x isin S; 
(b) x is not a limit point of S. 


Remark 1.1 


By definition, a point $\mathbf{x}$ in $S$ is either an isolated point of $S$ or a limit point
of $S$.


Theorem 1.37 gives the following immediately. 


Theorem 1.41 


Let $S$ be a subset of $\mathbb{R}^n$ and let $\mathbf{x}$ be a point in $S$. Then $\mathbf{x}$ is an isolated
point of $S$ if and only if there is an $r > 0$ such that the ball $B(\mathbf{x}, r)$ does
not contain any point of $S$ other than $\mathbf{x}$.




Example 1.36 


Find the set of limit points and isolated points of the set $A = \mathbb{Z}^2$ as a subset
of $\mathbb{R}^2$.


Solution 
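If $\{\mathbf{x}_k\}$ is a sequence in $A$ that converges to a point $\mathbf{x}$, then there is a positive
integer $K$ such that for all $l > k \geq K$,

$$\|\mathbf{x}_l - \mathbf{x}_k\| < 1.$$

Since two distinct points of $\mathbb{Z}^2$ are at distance at least 1 from each other, this implies that $\mathbf{x}_k = \mathbf{x}_K$ for all $k \geq K$. Hence, $\mathbf{x} = \mathbf{x}_K \in A$. This shows
that $A$ is closed. Hence, $\overline{A} = A$. Therefore, $A' \subset A$.

For every $\mathbf{x} = (k, l) \in \mathbb{Z}^2$, $B(\mathbf{x}, 1)$ intersects $A$ only at the point $\mathbf{x}$ itself.
Hence, $\mathbf{x}$ is an isolated point of $A$. This shows that every point of $A$ is an
isolated point. Since $A' \subset A$, we must have $A' = \emptyset$.

Figure 1.17: The set $\mathbb{Z}^2$ does not have limit points.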


Let us prove the following useful fact. 


Theorem 1.42 


If $S$ is a subset of $\mathbb{R}^n$, every interior point of $S$ is a limit point of $S$.


If $\mathbf{x}$ is an interior point of $S$, there exists $r_0 > 0$ such that $B(\mathbf{x}, r_0) \subset S$.
Given $r > 0$, let $r' = \frac{1}{2}\min\{r, r_0\}$. Then $r' > 0$. Since $r' < r$ and $r' < r_0$,
the point

$$\mathbf{x}' = \mathbf{x} + r'\mathbf{e}_1$$

is in $B(\mathbf{x}, r)$ and $S$. Obviously, $\mathbf{x}' \neq \mathbf{x}$. Therefore, for every $r > 0$, $B(\mathbf{x}, r)$
intersects $S$ at a point other than $\mathbf{x}$. This proves that $\mathbf{x}$ is a limit point of $S$.


Since $S \subset \operatorname{int} S \cup \operatorname{bd} S$, and $\operatorname{int} S$ and $\operatorname{bd} S$ are disjoint, we deduce the
following.


Corollary 1.43 


Let S be a subset of R”. An isolated point of S must be a boundary point. 


Since every point in an open set S is an interior point of S, we obtain the 


following. 


Corollary 1.44 


If $S$ is an open subset of $\mathbb{R}^n$, every point of $S$ is a limit point. Namely,
$S \subset S'$.


Example 1.37 


If $I$ is an interval of the form $(a, b)$, $(a, b]$, $[a, b)$ or $[a, b]$ in $\mathbb{R}$, then $\operatorname{bd} I = \{a, b\}$.
It is easy to check that $a$ and $b$ are not isolated points of $I$. Hence, $I$
has no isolated points. Since $\overline{I} = I \cup I'$ and $I \subset I'$, we find that $I' = \overline{I} = [a, b]$.


In fact, we can prove a general theorem. 


Theorem 1.45 
Let $A$ and $B$ be subsets of $\mathbb{R}^n$ such that $A$ is open and $A \subset B \subset \overline{A}$. Then
$B' = \overline{A}$. In particular, the set of limit points of $A$ is $\overline{A}$.


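By Theorem 1.36, $\overline{A} = \overline{B}$. Since $A$ is open, $A \subset A'$. Since $\overline{A} = A \cup A'$,
we find that $\overline{A} = A'$.

In the exercises, one is asked to show that $A \subset B$ implies $A' \subset B'$.
Therefore, $\overline{A} = A' \subset B' \subset \overline{B}$. Since $\overline{A} = \overline{B}$, we must have $B' = \overline{B} = \overline{A}$.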


Example 1.38 


Let A be the subset of R? given by 


Av al, Wl (= 252) — ay) | a SF 




Exercises 1.5 
Question 1 


Let $A$ and $B$ be subsets of $\mathbb{R}^n$ such that $A \subset B$. Show that $A' \subset B'$.


Question 2 


Let $\mathbf{x}_0$ be a point in $\mathbb{R}^n$ and let $r$ be a positive number. Find the set of limit
points of the open ball $B(\mathbf{x}_0, r)$.


Question 3 
Let A be the subset of R? given by 
$$A = \{(x, y) \mid x < 0,\ y \geq 0\}.$$


Find the set of limit points of A. 


Question 4 


Let $\mathbf{x}_0$ be a point in $\mathbb{R}^n$, and let $r$ be a positive number. Consider the subset
of $\mathbb{R}^n$ given by

$$A = \{\mathbf{x} \in \mathbb{R}^n \mid 0 < \|\mathbf{x} - \mathbf{x}_0\| < r\}.$$
(a) Find the set of limit points of A. 


(b) Find the set of isolated points of the set S = R” \ A. 


Question 5 


Let A be the subset of R? given by 


A= {(x,y)|1 <x <3,-2<y <5}U {(0,0), (2, -3)}. 


Determine the set of isolated points and the set of limit points of A. 




Question 6 
Let $A = \mathbb{Q}^2$ as a subset of $\mathbb{R}^2$.
(a) Find the interior, exterior, boundary and closure of A. 


(b) Determine the set of isolated points and the set of limit points of A. 


Question 7 


Let S be a subset of IR”. Show that S is closed if and only if it contains all 


its limit points. 


Question 8 


Let $S$ be a subset of $\mathbb{R}^n$, and let $\mathbf{x}$ be a point in $\mathbb{R}^n$. Show that $\mathbf{x}$ is a limit
point of $S$ if and only if every neighbourhood of $\mathbf{x}$ intersects $S$ at a point
other than $\mathbf{x}$ itself.


Question 9 


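Let $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k$ be points in $\mathbb{R}^n$ and let $A = \mathbb{R}^n \setminus \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k\}$. Find
the set of limit points of $A$.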




Chapter 2 


Limits of Multivariable Functions and Continuity 


We are interested in functions $\mathbf{F} : \mathcal{D} \to \mathbb{R}^m$ that are defined on subsets $\mathcal{D}$ of $\mathbb{R}^n$,
taking values in $\mathbb{R}^m$. When $n \geq 2$, these are called multivariable functions. When
$m \geq 2$, they are called vector-valued functions. When $m = 1$, we usually write
the function as $f : \mathcal{D} \to \mathbb{R}$.


2.1 Multivariable Functions 


In this section, let us define some special classes of multivariable functions. 


2.1.1 Polynomials and Rational Functions 
A special class of functions is the set of polynomials in n variables. 
Definition 2.1 Polynomials 


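Let $\mathbf{k} = (k_1, \ldots, k_n)$ be an $n$-tuple of nonnegative integers. Associated
to this $n$-tuple $\mathbf{k}$, there is a monomial $p_{\mathbf{k}} : \mathbb{R}^n \to \mathbb{R}$ of degree $|\mathbf{k}| = k_1 + \cdots + k_n$ of the form

$$p_{\mathbf{k}}(\mathbf{x}) = x_1^{k_1} \cdots x_n^{k_n}.$$

A polynomial in $n$ variables is a function $p : \mathbb{R}^n \to \mathbb{R}$ that is a finite linear
combination of monomials in $n$ variables. It takes the form

$$p(\mathbf{x}) = \sum_{j=1}^{m} c_{\mathbf{k}_j}\, p_{\mathbf{k}_j}(\mathbf{x}),$$

where $\mathbf{k}_1, \mathbf{k}_2, \ldots, \mathbf{k}_m$ are distinct $n$-tuples of nonnegative integers, and
$c_{\mathbf{k}_1}, c_{\mathbf{k}_2}, \ldots, c_{\mathbf{k}_m}$ are nonzero real numbers. The degree of the polynomial
$p(\mathbf{x})$ is $\max\{|\mathbf{k}_1|, |\mathbf{k}_2|, \ldots, |\mathbf{k}_m|\}$.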
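
As an informal illustration of Definition 2.1, a polynomial can be stored as a map from exponent tuples $\mathbf{k}$ to coefficients $c_{\mathbf{k}}$. The sketch below is only one possible encoding; the function names `monomial`, `evaluate`, `degree` and the sample polynomial are assumptions introduced here for illustration.

```python
from math import prod

def monomial(k, x):
    """Evaluate p_k(x) = x_1**k_1 * ... * x_n**k_n for an exponent tuple k."""
    return prod(xi ** ki for xi, ki in zip(x, k))

def evaluate(poly, x):
    """Evaluate a polynomial stored as {exponent tuple: nonzero coefficient}."""
    return sum(c * monomial(k, x) for k, c in poly.items())

def degree(poly):
    """Degree = largest |k| = k_1 + ... + k_n among the monomials present."""
    return max(sum(k) for k in poly)

# Hypothetical example: p(x1, x2, x3) = 2*x1^2*x2 - 5*x3^3 + 7
p = {(2, 1, 0): 2.0, (0, 0, 3): -5.0, (0, 0, 0): 7.0}

print(degree(p))                     # 3
print(evaluate(p, (1.0, 2.0, 3.0)))  # -124.0
```

Keying the dictionary by exponent tuples mirrors the requirement that the $n$-tuples $\mathbf{k}_1, \ldots, \mathbf{k}_m$ be distinct.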




Example 2.1 


The following are examples of polynomials in three variables. 


(a) p(t, 22,23) = 27 +45 +23 


(Disp (Gig. ea) — Ax? ro = 3023 + £1 2ov3 


Example 2.2 


The function f : | 


is not a polynomial. 


When the domain of a function is not specified, we always assume that the 
domain is the largest set on which the function can be defined. 


Definition 2.2 Rational Functions 


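A rational function $f : \mathcal{D} \to \mathbb{R}$ is the quotient of two polynomials $p : \mathbb{R}^n \to \mathbb{R}$
and $q : \mathbb{R}^n \to \mathbb{R}$. Namely,

$$f(\mathbf{x}) = \frac{p(\mathbf{x})}{q(\mathbf{x})}.$$

Its domain $\mathcal{D}$ is the set

$$\mathcal{D} = \{\mathbf{x} \in \mathbb{R}^n \mid q(\mathbf{x}) \neq 0\}.$$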


Example 2.3 


The function 


21% + 3x? 


eee £2) a 


is arational function defined on the set 


v1 — v2 


= { (21, 29) ER’ |2,F tye 




2.1.2 Component Functions of a Mapping 


If the codomain $\mathbb{R}^m$ of the function $\mathbf{F} : \mathcal{D} \to \mathbb{R}^m$ has dimension $m \geq 2$, we
usually call the function a mapping. In this case, it is useful to consider the
component functions.

For $1 \leq j \leq m$, the projection function $\pi_j : \mathbb{R}^m \to \mathbb{R}$ is the function

$$\pi_j(x_1, \ldots, x_m) = x_j.$$


Definition 2.3 Component Functions 


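Let $\mathbf{F} : \mathcal{D} \to \mathbb{R}^m$ be a function defined on $\mathcal{D} \subset \mathbb{R}^n$. For $1 \leq j \leq m$, the
$j$-th component function of $\mathbf{F}$ is the function $F_j : \mathcal{D} \to \mathbb{R}$ defined as

$$F_j = (\pi_j \circ \mathbf{F}) : \mathcal{D} \to \mathbb{R}.$$

For each $\mathbf{x} \in \mathcal{D}$,
$$\mathbf{F}(\mathbf{x}) = (F_1(\mathbf{x}), F_2(\mathbf{x}), \ldots, F_m(\mathbf{x})).$$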


Example 2.4 


For the function $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$, $\mathbf{F}(\mathbf{x}) = -3\mathbf{x}$, the component functions are

$$F_1(x_1, x_2, x_3) = -3x_1, \quad F_2(x_1, x_2, x_3) = -3x_2, \quad F_3(x_1, x_2, x_3) = -3x_3.$$


For convenience, we also define the notion of polynomial mappings. 


Definition 2.4 Polynomial Mappings 


We call a function $\mathbf{F} : \mathbb{R}^n \to \mathbb{R}^m$ a polynomial mapping if each of its
components $F_j : \mathbb{R}^n \to \mathbb{R}$, $1 \leq j \leq m$, is a polynomial function. The
degree of the polynomial mapping $\mathbf{F}$ is the maximum of the degrees of the
polynomials $F_1, F_2, \ldots, F_m$.


Example 2.5 


The mapping $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^2$,

$$\mathbf{F}(x, y, z) = (x^2 y + 3xz,\ 8yz^3 - 7x)$$

is a polynomial mapping of degree 4.




2.1.3 Invertible Mappings 


The invertibility of a function $\mathbf{F} : \mathcal{D} \to \mathbb{R}^m$ is defined in the following way.


Definition 2.5 Inverse Functions 


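Let $\mathcal{D}$ be a subset of $\mathbb{R}^n$, and let $\mathbf{F} : \mathcal{D} \to \mathbb{R}^m$ be a function defined on
$\mathcal{D}$. We say that $\mathbf{F}$ is invertible if $\mathbf{F}$ is one-to-one. In this case, the inverse
function $\mathbf{F}^{-1} : \mathbf{F}(\mathcal{D}) \to \mathcal{D}$ is defined so that for each $\mathbf{y} \in \mathbf{F}(\mathcal{D})$,

$$\mathbf{F}^{-1}(\mathbf{y}) = \mathbf{x} \quad \text{if and only if} \quad \mathbf{F}(\mathbf{x}) = \mathbf{y}.$$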


Example 2.6 


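Let $\mathcal{D} = \{(x, y) \mid x > 0,\ y > 0\}$ and let $\mathbf{F} : \mathcal{D} \to \mathbb{R}^2$ be the function
defined as

$$\mathbf{F}(x, y) = (x - y,\ x + y).$$

Show that $\mathbf{F}$ is invertible and find its inverse.

Solution
Let $u = x - y$ and $v = x + y$. Then

$$x = \frac{u + v}{2}, \qquad y = \frac{v - u}{2}.$$

This shows that for any $(u, v) \in \mathbb{R}^2$, there is at most one pair $(x, y)$ such
that $\mathbf{F}(x, y) = (u, v)$. Thus, $\mathbf{F}$ is one-to-one, and hence, it is invertible.
Since $x > 0$ and $y > 0$ are equivalent to $u + v > 0$ and $v - u > 0$, observe that

$$\mathbf{F}(\mathcal{D}) = \{(u, v) \mid v > |u|\}.$$

The inverse mapping is given by $\mathbf{F}^{-1} : \mathbf{F}(\mathcal{D}) \to \mathbb{R}^2$,

$$\mathbf{F}^{-1}(u, v) = \left(\frac{u + v}{2},\ \frac{v - u}{2}\right).$$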
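
The computation in Example 2.6 can be checked numerically. Solving $u = x - y$, $v = x + y$ gives $x = (u+v)/2$ and $y = (v-u)/2$, and the conditions $x > 0$, $y > 0$ translate to $v > |u|$. The sketch below assumes these formulas; the names `F`, `F_inv` and `in_image` are chosen here for illustration only.

```python
def F(x, y):
    """The mapping F(x, y) = (x - y, x + y) from Example 2.6."""
    return (x - y, x + y)

def F_inv(u, v):
    """Candidate inverse obtained by solving u = x - y, v = x + y."""
    return ((u + v) / 2, (v - u) / 2)

def in_image(u, v):
    """(u, v) lies in F(D) for D = {x > 0, y > 0} exactly when v > |u|."""
    return v > abs(u)

x, y = 3.0, 1.0                  # a sample point of D
u, v = F(x, y)
print((u, v), in_image(u, v))    # (2.0, 4.0) True
print(F_inv(u, v))               # (3.0, 1.0)
```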




2.1.4 Linear Transformations 


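Another special class of functions consists of linear transformations. A function
$\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation if for any $\mathbf{x}_1, \ldots, \mathbf{x}_k$ in $\mathbb{R}^n$, and for any
$c_1, \ldots, c_k$ in $\mathbb{R}$,
$$\mathbf{T}(c_1\mathbf{x}_1 + \cdots + c_k\mathbf{x}_k) = c_1\mathbf{T}(\mathbf{x}_1) + \cdots + c_k\mathbf{T}(\mathbf{x}_k).$$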


Linear transformations are closely related to matrices. 
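An $m \times n$ matrix $A$ is an array with $m$ rows and $n$ columns of real numbers.
It has the form

$$A = [a_{ij}] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}.$$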


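If $A = [a_{ij}]$ and $B = [b_{ij}]$ are $m \times n$ matrices, and $\alpha$ and $\beta$ are real numbers, $\alpha A + \beta B$
is defined to be the $m \times n$ matrix $C = \alpha A + \beta B = [c_{ij}]$ with

$$c_{ij} = \alpha a_{ij} + \beta b_{ij}.$$

If $A = [a_{il}]$ is an $m \times k$ matrix and $B = [b_{lj}]$ is a $k \times n$ matrix, the product $AB$ is
defined to be the $m \times n$ matrix $C = AB = [c_{ij}]$, where

$$c_{ij} = \sum_{l=1}^{k} a_{il} b_{lj}.$$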


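It is easy to verify that matrix multiplication is associative.

Given $\mathbf{x} = (x_1, \ldots, x_n)$ in $\mathbb{R}^n$, we identify it with the column vector

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix},$$

which is an $n \times 1$ matrix. If $A$ is an $m \times n$ matrix, and $\mathbf{x}$ is a vector in $\mathbb{R}^n$, then
$\mathbf{y} = A\mathbf{x}$ is the vector in $\mathbb{R}^m$ given by

$$\mathbf{y} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{bmatrix}.$$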




The following is a standard result in linear algebra. 


Theorem 2.1 


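A function $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation if and only if there
exists an $m \times n$ matrix $A = [a_{ij}]$ such that
$$\mathbf{T}(\mathbf{x}) = A\mathbf{x}.$$

In this case, $A$ is called the matrix associated to the linear transformation
$\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$.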


Sketch of Proof 
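It is easy to verify that the mapping $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$, $\mathbf{T}(\mathbf{x}) = A\mathbf{x}$ is a linear
transformation if $A$ is an $m \times n$ matrix.

Conversely, if $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation, then for any $\mathbf{x} \in \mathbb{R}^n$,

$$\mathbf{T}(\mathbf{x}) = \mathbf{T}(x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \cdots + x_n\mathbf{e}_n) = x_1\mathbf{T}(\mathbf{e}_1) + x_2\mathbf{T}(\mathbf{e}_2) + \cdots + x_n\mathbf{T}(\mathbf{e}_n).$$

Define the vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ in $\mathbb{R}^m$ by
$$\mathbf{a}_1 = \mathbf{T}(\mathbf{e}_1), \quad \mathbf{a}_2 = \mathbf{T}(\mathbf{e}_2), \quad \ldots, \quad \mathbf{a}_n = \mathbf{T}(\mathbf{e}_n).$$

Let $A$ be the $m \times n$ matrix with column vectors $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$. Namely,

$$A = \begin{bmatrix} \mathbf{a}_1 \mid \mathbf{a}_2 \mid \cdots \mid \mathbf{a}_n \end{bmatrix}.$$

Then we have $\mathbf{T}(\mathbf{x}) = A\mathbf{x}$.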


Example 2.7 


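Let $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2$ be the function defined as

$$\mathbf{F}(x, y) = (x - y,\ x + y).$$

Then $\mathbf{F}$ is a linear transformation with matrix $A = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}$.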


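For the linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$, $\mathbf{T}(\mathbf{x}) = A\mathbf{x}$, the component
functions are

$$\begin{aligned} T_1(\mathbf{x}) &= a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n, \\ T_2(\mathbf{x}) &= a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n, \\ &\ \ \vdots \\ T_m(\mathbf{x}) &= a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n. \end{aligned}$$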


Each of them is a polynomial of degree at most one. Thus, a linear transformation 
is a polynomial mapping of degree at most one. It is easy to deduce the following. 


Corollary 2.2 


A mapping $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation if and only if each
component function is a linear transformation.

The following are some standard results about linear transformations.


Theorem 2.3 


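If $\mathbf{S} : \mathbb{R}^n \to \mathbb{R}^m$ and $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$ are linear transformations with
matrices $A$ and $B$ respectively, then for any real numbers $\alpha$ and $\beta$,
$\alpha\mathbf{S} + \beta\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation with matrix $\alpha A + \beta B$.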


Theorem 2.4 


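If $\mathbf{S} : \mathbb{R}^n \to \mathbb{R}^m$ and $\mathbf{T} : \mathbb{R}^m \to \mathbb{R}^k$ are linear transformations with matrices
$A$ and $B$, then $\mathbf{T} \circ \mathbf{S} : \mathbb{R}^n \to \mathbb{R}^k$ is a linear transformation with matrix $BA$.

Sketch of Proof
This follows from

$$(\mathbf{T} \circ \mathbf{S})(\mathbf{x}) = \mathbf{T}(\mathbf{S}(\mathbf{x})) = B(A\mathbf{x}) = (BA)\mathbf{x}.$$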
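
The identity $B(A\mathbf{x}) = (BA)\mathbf{x}$ behind Theorem 2.4 can be checked on a small instance. The following is a minimal sketch in plain Python; the helper names `mat_vec` and `mat_mul` and the sample matrices are assumptions made here for illustration.

```python
def mat_vec(A, x):
    """Return the vector A x for a matrix A given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def mat_mul(B, A):
    """Return the matrix product BA (rows of B against columns of A)."""
    cols = list(zip(*A))
    return [[sum(b * a for b, a in zip(row, col)) for col in cols] for row in B]

A = [[1, -1], [1, 1], [2, 0]]   # matrix of a hypothetical S : R^2 -> R^3
B = [[1, 0, 2], [0, 3, -1]]     # matrix of a hypothetical T : R^3 -> R^2
x = [4, 5]

print(mat_vec(B, mat_vec(A, x)))   # T(S(x)):  [15, 19]
print(mat_vec(mat_mul(B, A), x))   # (BA) x:   [15, 19]
```

Note the order: the matrix of $\mathbf{T} \circ \mathbf{S}$ is $BA$, not $AB$, because $\mathbf{S}$ is applied first.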


In the particular case when m = n, we have the following. 




Theorem 2.5 


Let $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation represented by the matrix $A$.
The following are equivalent.

(a) The mapping $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$ is one-to-one.

(b) The mapping $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$ is onto.

(c) The matrix $A$ is invertible.

(d) $\det A \neq 0$.

In other words, if the linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$ is one-to-one or
onto, then it is bijective. In this case, the linear transformation is invertible, and
we can define the inverse function $\mathbf{T}^{-1} : \mathbb{R}^n \to \mathbb{R}^n$.


Theorem 2.6 


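Let $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$ be an invertible linear transformation represented by
the matrix $A$. Then the inverse mapping $\mathbf{T}^{-1} : \mathbb{R}^n \to \mathbb{R}^n$ is also a linear
transformation and
$$\mathbf{T}^{-1}(\mathbf{x}) = A^{-1}\mathbf{x}.$$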


Example 2.8 


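Let $\mathbf{T} : \mathbb{R}^2 \to \mathbb{R}^2$ be the linear transformation

$$\mathbf{T}(x, y) = (x - y,\ x + y).$$

The matrix associated with $\mathbf{T}$ is $A = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}$. Since $\det A = 2 \neq 0$, $\mathbf{T}$ is
invertible. Since $A^{-1} = \dfrac{1}{2}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}$, we have

$$\mathbf{T}^{-1}(x, y) = \left(\frac{x + y}{2},\ \frac{-x + y}{2}\right).$$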




2.1.5 Quadratic Forms 


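Given an $m \times n$ matrix $A = [a_{ij}]$, its transpose is the $n \times m$ matrix $A^T = [b_{ij}]$,
where
$$b_{ij} = a_{ji} \quad \text{for all } 1 \leq i \leq n,\ 1 \leq j \leq m.$$

An $n \times n$ matrix $A$ is symmetric if
$$A = A^T.$$
An $n \times n$ matrix $P$ is orthogonal if

$$P^T P = P P^T = I.$$

If the column vectors of $P$ are $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$, so that

$$P = \begin{bmatrix} \mathbf{v}_1 \mid \mathbf{v}_2 \mid \cdots \mid \mathbf{v}_n \end{bmatrix}, \tag{2.1}$$

then $P$ is orthogonal if and only if $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is an orthonormal set of vectors
in $\mathbb{R}^n$.

If $A$ is an $n \times n$ symmetric matrix, its characteristic polynomial

$$p(\lambda) = \det(\lambda I_n - A)$$

is a monic polynomial of degree $n$ with $n$ real roots $\lambda_1, \lambda_2, \ldots, \lambda_n$, counting
with multiplicities. These roots are called the eigenvalues of $A$. There is an
orthonormal set of vectors $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ in $\mathbb{R}^n$ such that

$$A\mathbf{v}_i = \lambda_i \mathbf{v}_i \quad \text{for all } 1 \leq i \leq n. \tag{2.2}$$

Let $D$ be the diagonal matrix

$$D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}, \tag{2.3}$$

and let $P$ be the orthogonal matrix (2.1). Then (2.2) is equivalent to $AP = PD$,
or equivalently,
$$A = PDP^T = PDP^{-1}.$$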




This is known as the orthogonal diagonalization of the real symmetric matrix A. 


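A quadratic form in $\mathbb{R}^n$ is a polynomial function $Q : \mathbb{R}^n \to \mathbb{R}$ of the form

$$Q(\mathbf{x}) = \sum_{1 \leq i \leq j \leq n} c_{ij} x_i x_j.$$

An $n \times n$ symmetric matrix $A = [a_{ij}]$ defines a quadratic form $Q_A : \mathbb{R}^n \to \mathbb{R}$ by

$$Q_A(\mathbf{x}) = \mathbf{x}^T A \mathbf{x} = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j.$$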


Example 2.9 


1 -2 
The symmetric matrix A = a | defines the quadratic form 


Qa(z,y) = x7 — Aay + By”. 


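Conversely, given a quadratic form

$$Q(\mathbf{x}) = \sum_{1 \leq i \leq j \leq n} c_{ij} x_i x_j,$$

then $Q = Q_A$, where the entries of $A = [a_{ij}]$ are

$$a_{ij} = \begin{cases} c_{ii}, & \text{if } i = j, \\ c_{ij}/2, & \text{if } i < j, \\ c_{ji}/2, & \text{if } i > j. \end{cases}$$

Thus, there is a one-to-one correspondence between quadratic forms and symmetric
matrices.

If $A = PDP^T$ is an orthogonal diagonalization of $A$, under the change of
variables
$$\mathbf{y} = P^T\mathbf{x}, \quad \text{or equivalently,} \quad \mathbf{x} = P\mathbf{y},$$

we find that
$$Q_A(\mathbf{x}) = \mathbf{y}^T D \mathbf{y} = \lambda_1 y_1^2 + \cdots + \lambda_n y_n^2. \tag{2.4}$$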


A consequence of (2.4) is the following. 




Theorem 2.7 


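Let $A$ be an $n \times n$ symmetric matrix, and let $Q_A(\mathbf{x}) = \mathbf{x}^T A\mathbf{x}$ be the
associated quadratic form. Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the eigenvalues of $A$.
Assume that

$$\lambda_n \leq \cdots \leq \lambda_2 \leq \lambda_1.$$

Then for any $\mathbf{x} \in \mathbb{R}^n$,

$$\lambda_n \|\mathbf{x}\|^2 \leq Q_A(\mathbf{x}) \leq \lambda_1 \|\mathbf{x}\|^2.$$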


Sketch of Proof 
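Given $\mathbf{x} \in \mathbb{R}^n$, let $\mathbf{y} = P^T\mathbf{x}$. Then

$$\|\mathbf{y}\|^2 = \mathbf{y}^T\mathbf{y} = \mathbf{x}^T P P^T \mathbf{x} = \mathbf{x}^T\mathbf{x} = \|\mathbf{x}\|^2.$$

By (2.4),
$$Q_A(\mathbf{x}) = \lambda_1 y_1^2 + \cdots + \lambda_n y_n^2.$$

Since $\lambda_n \leq \cdots \leq \lambda_2 \leq \lambda_1$, we find that
$$\lambda_n \|\mathbf{y}\|^2 \leq \lambda_1 y_1^2 + \cdots + \lambda_n y_n^2 \leq \lambda_1 \|\mathbf{y}\|^2.$$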


The assertion follows. 
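
The bounds in Theorem 2.7 can be tested numerically in the $2 \times 2$ case, where the eigenvalues of a symmetric matrix $\begin{bmatrix} a & b \\ b & d \end{bmatrix}$ are given by the closed formula $\frac{a+d}{2} \pm \sqrt{\left(\frac{a-d}{2}\right)^2 + b^2}$. The sketch below is only a sanity check on random points, not a proof; the function names, the sample matrix and the tolerance are assumptions made here.

```python
import math
import random

def eigenvalues_sym2(a, b, d):
    """Eigenvalues (largest first) of the symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean + radius, mean - radius

def Q(a, b, d, x, y):
    """Quadratic form x^T A x for A = [[a, b], [b, d]]."""
    return a * x * x + 2 * b * x * y + d * y * y

a, b, d = 2.0, 1.0, 2.0              # hypothetical sample matrix
lam_max, lam_min = eigenvalues_sym2(a, b, d)

random.seed(0)
tol = 1e-9                           # allowance for floating-point rounding
ok = True
for _ in range(1000):
    x, y = random.uniform(-5, 5), random.uniform(-5, 5)
    norm_sq = x * x + y * y
    q = Q(a, b, d, x, y)
    ok = ok and (lam_min * norm_sq - tol <= q <= lam_max * norm_sq + tol)

print(lam_max, lam_min, ok)
```

For this sample matrix the eigenvalues are 3 and 1, so the check amounts to $\|\mathbf{x}\|^2 \leq Q_A(\mathbf{x}) \leq 3\|\mathbf{x}\|^2$.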


At the end of this section, let us recall the classification of quadratic forms. 


Definiteness of Symmetric Matrices 


Given an $n \times n$ symmetric matrix $A = [a_{ij}]$, let $Q_A : \mathbb{R}^n \to \mathbb{R}$,

$$Q_A(\mathbf{x}) = \mathbf{x}^T A\mathbf{x} = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j$$

be the associated quadratic form.

1. We say that the matrix $A$ is positive definite, or the quadratic form $Q_A$
is positive definite, if $Q_A(\mathbf{x}) > 0$ for all $\mathbf{x} \neq \mathbf{0}$ in $\mathbb{R}^n$.

2. We say that the matrix $A$ is negative definite, or the quadratic form $Q_A$
is negative definite, if $Q_A(\mathbf{x}) < 0$ for all $\mathbf{x} \neq \mathbf{0}$ in $\mathbb{R}^n$.

3. We say that the matrix $A$ is indefinite, or the quadratic form $Q_A$ is
indefinite, if there exist $\mathbf{u}$ and $\mathbf{v}$ in $\mathbb{R}^n$ such that $Q_A(\mathbf{u}) > 0$ and
$Q_A(\mathbf{v}) < 0$.

4. We say that the matrix $A$ is positive semi-definite, or the quadratic form
$Q_A$ is positive semi-definite, if $Q_A(\mathbf{x}) \geq 0$ for all $\mathbf{x}$ in $\mathbb{R}^n$.

5. We say that the matrix $A$ is negative semi-definite, or the quadratic form
$Q_A$ is negative semi-definite, if $Q_A(\mathbf{x}) \leq 0$ for all $\mathbf{x}$ in $\mathbb{R}^n$.


Obviously, a symmetric matrix A is negative definite if and only if —A is 
positive definite. 

The following is a standard result in linear algebra, which can be deduced 
from (2.4). 


Theorem 2.8 


Let $A$ be an $n \times n$ symmetric matrix, and let $Q_A(\mathbf{x}) = \mathbf{x}^T A\mathbf{x}$ be the
associated quadratic form. Let $\{\lambda_1, \ldots, \lambda_n\}$ be the set of eigenvalues of
$A$, repeated with multiplicities.

(a) $Q_A$ is positive definite if and only if $\lambda_i > 0$ for all $1 \leq i \leq n$.

(b) $Q_A$ is negative definite if and only if $\lambda_i < 0$ for all $1 \leq i \leq n$.

(c) $Q_A$ is indefinite if there exist $i$ and $j$ so that $\lambda_i > 0$ and $\lambda_j < 0$.

(d) $Q_A$ is positive semi-definite if and only if $\lambda_i \geq 0$ for all $1 \leq i \leq n$.

(e) $Q_A$ is negative semi-definite if and only if $\lambda_i \leq 0$ for all $1 \leq i \leq n$.
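
The eigenvalue criteria of Theorem 2.8 translate directly into a classification routine. The sketch below handles only $2 \times 2$ symmetric matrices $\begin{bmatrix} a & b \\ b & d \end{bmatrix}$, using the closed formula for their eigenvalues; the function names are assumptions introduced here, and the exact comparisons with zero are reliable only for inputs whose eigenvalues are computed without rounding error.

```python
import math

def eigenvalues_sym2(a, b, d):
    """Eigenvalues (largest first) of the symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean + radius, mean - radius

def classify(a, b, d):
    """Classify the quadratic form of [[a, b], [b, d]] by the signs of its eigenvalues."""
    hi, lo = eigenvalues_sym2(a, b, d)
    if lo > 0:
        return "positive definite"
    if hi < 0:
        return "negative definite"
    if lo < 0 < hi:
        return "indefinite"
    if lo == 0:
        # Semi-definite but not definite; the zero matrix also lands here,
        # although it is negative semi-definite as well.
        return "positive semi-definite"
    return "negative semi-definite"

print(classify(2, 1, 2))     # eigenvalues 3, 1   -> positive definite
print(classify(1, 2, 1))     # eigenvalues 3, -1  -> indefinite
print(classify(1, 1, 1))     # eigenvalues 2, 0   -> positive semi-definite
print(classify(-2, 1, -2))   # eigenvalues -1, -3 -> negative definite
```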


From Theorem 2.7 and Theorem 2.8, we obtain the following. 




Corollary 2.9 


Let $Q : \mathbb{R}^n \to \mathbb{R}$ be a quadratic form. If $Q$ is positive definite, then there
exists a positive constant $c$ such that

$$Q(\mathbf{x}) \geq c\|\mathbf{x}\|^2 \quad \text{for all } \mathbf{x} \in \mathbb{R}^n.$$

In fact, $c$ can be any positive number that is less than or equal to the smallest
eigenvalue of the symmetric matrix $A$ associated to the quadratic form $Q$.




2.2 Limits of Functions 


In this section, we study limits of multivariable functions. 


Definition 2.6 Limits of Functions 


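Let $\mathcal{D}$ be a subset of $\mathbb{R}^n$ and let $\mathbf{x}_0$ be a limit point of $\mathcal{D}$. Given a function
$\mathbf{F} : \mathcal{D} \to \mathbb{R}^m$, we say that the limit of $\mathbf{F}(\mathbf{x})$ as $\mathbf{x}$ approaches $\mathbf{x}_0$ is $\mathbf{v}$,
provided that whenever $\{\mathbf{x}_k\}$ is a sequence of points in $\mathcal{D} \setminus \{\mathbf{x}_0\}$ that
converges to $\mathbf{x}_0$, the sequence $\{\mathbf{F}(\mathbf{x}_k)\}$ of points in $\mathbb{R}^m$ converges to the
point $\mathbf{v}$.

If the limit of $\mathbf{F} : \mathcal{D} \to \mathbb{R}^m$ as $\mathbf{x}$ approaches $\mathbf{x}_0$ is $\mathbf{v}$, we write

$$\lim_{\mathbf{x} \to \mathbf{x}_0} \mathbf{F}(\mathbf{x}) = \mathbf{v}.$$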


Example 2.10 


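For $1 \leq i \leq n$, let $\pi_i : \mathbb{R}^n \to \mathbb{R}$ be the projection function $\pi_i(x_1, \ldots, x_n) = x_i$.
By the theorem on componentwise convergence of sequences, if $\{\mathbf{x}_k\}$
is a sequence in $\mathbb{R}^n \setminus \{\mathbf{x}_0\}$ that converges to the point $\mathbf{x}_0$, then

$$\lim_{k \to \infty} \pi_i(\mathbf{x}_k) = \pi_i(\mathbf{x}_0).$$

This means that

$$\lim_{\mathbf{x} \to \mathbf{x}_0} \pi_i(\mathbf{x}) = \pi_i(\mathbf{x}_0).$$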


From the theorem on componentwise convergence of sequences, we also obtain 
the following immediately. 




Proposition 2.10 


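Let $\mathcal{D}$ be a subset of $\mathbb{R}^n$ and let $\mathbf{x}_0$ be a limit point of $\mathcal{D}$. Given a function
$\mathbf{F} : \mathcal{D} \to \mathbb{R}^m$,

$$\lim_{\mathbf{x} \to \mathbf{x}_0} \mathbf{F}(\mathbf{x}) = \mathbf{v}$$

if and only if for each $1 \leq j \leq m$,

$$\lim_{\mathbf{x} \to \mathbf{x}_0} F_j(\mathbf{x}) = \pi_j(\mathbf{v}).$$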


Example 2.11 


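Let $f : \mathbb{R}^n \to \mathbb{R}$ be the function defined as $f(\mathbf{x}) = \|\mathbf{x}\|$. If $\mathbf{x}_0$ is a point in
$\mathbb{R}^n$, find $\lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x})$.

Solution

We have shown in Example 1.15 that if $\{\mathbf{x}_k\}$ is a sequence in $\mathbb{R}^n \setminus \{\mathbf{x}_0\}$
that converges to $\mathbf{x}_0$, then

$$\lim_{k \to \infty} \|\mathbf{x}_k\| = \|\mathbf{x}_0\|.$$

Therefore, $\lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) = \|\mathbf{x}_0\|$.

By the limit laws for sequences, we also have the following.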


Proposition 2.11 


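Let $\mathbf{F} : \mathcal{D} \to \mathbb{R}^m$ and $\mathbf{G} : \mathcal{D} \to \mathbb{R}^m$ be functions defined on $\mathcal{D} \subset \mathbb{R}^n$. If $\mathbf{x}_0$
is a limit point of $\mathcal{D}$ and

$$\lim_{\mathbf{x} \to \mathbf{x}_0} \mathbf{F}(\mathbf{x}) = \mathbf{u}, \qquad \lim_{\mathbf{x} \to \mathbf{x}_0} \mathbf{G}(\mathbf{x}) = \mathbf{v},$$

then for any real numbers $\alpha$ and $\beta$,

$$\lim_{\mathbf{x} \to \mathbf{x}_0} (\alpha\mathbf{F} + \beta\mathbf{G})(\mathbf{x}) = \alpha\mathbf{u} + \beta\mathbf{v}.$$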


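Proposition 2.12

Let $f : \mathcal{D} \to \mathbb{R}$ and $g : \mathcal{D} \to \mathbb{R}$ be functions defined on $\mathcal{D} \subset \mathbb{R}^n$. If $\mathbf{x}_0$ is a
limit point of $\mathcal{D}$ and

$$\lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) = u, \qquad \lim_{\mathbf{x} \to \mathbf{x}_0} g(\mathbf{x}) = v,$$

then

$$\lim_{\mathbf{x} \to \mathbf{x}_0} (fg)(\mathbf{x}) = uv.$$

If $g(\mathbf{x}) \neq 0$ for all $\mathbf{x} \in \mathcal{D}$, and $v \neq 0$, then

$$\lim_{\mathbf{x} \to \mathbf{x}_0} \left(\frac{f}{g}\right)(\mathbf{x}) = \frac{u}{v}.$$

Example 2.12

If $\mathbf{k} = (k_1, \ldots, k_n)$ is an $n$-tuple of nonnegative integers, the monomial

$$p_{\mathbf{k}}(\mathbf{x}) = x_1^{k_1} \cdots x_n^{k_n}$$

can be written as a product of the projection functions $\pi_i : \mathbb{R}^n \to \mathbb{R}$,
$1 \leq i \leq n$. By Proposition 2.12,

$$\lim_{\mathbf{x} \to \mathbf{x}_0} p_{\mathbf{k}}(\mathbf{x}) = p_{\mathbf{k}}(\mathbf{x}_0)$$

for any $\mathbf{x}_0$ in $\mathbb{R}^n$. If $p : \mathbb{R}^n \to \mathbb{R}$ is a polynomial, it is a finite linear
combination of monomials. Proposition 2.11 then implies that for any $\mathbf{x}_0$
in $\mathbb{R}^n$,

$$\lim_{\mathbf{x} \to \mathbf{x}_0} p(\mathbf{x}) = p(\mathbf{x}_0).$$

If $f : \mathcal{D} \to \mathbb{R}$, $f(\mathbf{x}) = p(\mathbf{x})/q(\mathbf{x})$ is a rational function which is equal to the
quotient of the polynomial $p(\mathbf{x})$ by the polynomial $q(\mathbf{x})$, then Proposition
2.12 implies that

$$\lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) = f(\mathbf{x}_0)$$

for any $\mathbf{x}_0 \in \mathcal{D} = \{\mathbf{x} \in \mathbb{R}^n \mid q(\mathbf{x}) \neq 0\}$.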




Example 2.13 


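Find $\displaystyle\lim_{(x,y) \to (1,-1)} \frac{x^2 + 3xy + 2y^2}{x^2 + y^2}$.

Solution

Since

$$\lim_{(x,y) \to (1,-1)} (x^2 + 3xy + 2y^2) = 1 - 3 + 2 = 0,$$

$$\lim_{(x,y) \to (1,-1)} (x^2 + y^2) = 1 + 1 = 2,$$

we find that

$$\lim_{(x,y) \to (1,-1)} \frac{x^2 + 3xy + 2y^2}{x^2 + y^2} = \frac{0}{2} = 0.$$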


It is easy to deduce the limit law for composite functions. 


Proposition 2.13 


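Let $\mathcal{D}$ be a subset of $\mathbb{R}^n$, and let $\mathcal{U}$ be a subset of $\mathbb{R}^k$. Given the two
functions $\mathbf{F} : \mathcal{D} \to \mathbb{R}^k$ and $\mathbf{G} : \mathcal{U} \to \mathbb{R}^m$, if $\mathbf{F}(\mathcal{D}) \subset \mathcal{U}$, we can define the
composite function $\mathbf{H} = \mathbf{G} \circ \mathbf{F} : \mathcal{D} \to \mathbb{R}^m$ by $\mathbf{H}(\mathbf{x}) = \mathbf{G}(\mathbf{F}(\mathbf{x}))$. If $\mathbf{x}_0$ is
a limit point of $\mathcal{D}$, $\mathbf{y}_0$ is a limit point of $\mathcal{U}$, $\mathbf{F}(\mathcal{D} \setminus \{\mathbf{x}_0\}) \subset \mathcal{U} \setminus \{\mathbf{y}_0\}$,

$$\lim_{\mathbf{x} \to \mathbf{x}_0} \mathbf{F}(\mathbf{x}) = \mathbf{y}_0 \quad \text{and} \quad \lim_{\mathbf{y} \to \mathbf{y}_0} \mathbf{G}(\mathbf{y}) = \mathbf{v},$$

then

$$\lim_{\mathbf{x} \to \mathbf{x}_0} (\mathbf{G} \circ \mathbf{F})(\mathbf{x}) = \mathbf{v}.$$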


The proof repeats verbatim the proof of the corresponding theorem for single 
variable functions. 


Figure 2.1: The function $f(x, y) = \dfrac{x^2 + 3xy + 2y^2}{x^2 + y^2}$ in Example 2.13.

Example 2.14

Find the limit $\displaystyle\lim_{(x,y) \to (0,0)} \frac{\sin(2x^2 + 3y^2)}{2x^2 + 3y^2}$.

Solution
Since

$$\lim_{(x,y) \to (0,0)} (2x^2 + 3y^2) = 2 \times 0 + 3 \times 0 = 0,$$

the limit law for composite functions implies that

$$\lim_{(x,y) \to (0,0)} \frac{\sin(2x^2 + 3y^2)}{2x^2 + 3y^2} = \lim_{u \to 0} \frac{\sin u}{u} = 1.$$

Figure 2.2: The function $f(x, y) = \dfrac{\sin(2x^2 + 3y^2)}{2x^2 + 3y^2}$ in Example 2.14.
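
A quick numerical check is consistent with the value 1 for the limit in Example 2.14. The sketch below evaluates $\sin(2x^2+3y^2)/(2x^2+3y^2)$ along two sequences of points approaching $(0,0)$; the choice of sequences is an assumption made here, and such a check illustrates the limit rather than proving it.

```python
import math

def f(x, y):
    """f(x, y) = sin(2x^2 + 3y^2) / (2x^2 + 3y^2), defined for (x, y) != (0, 0)."""
    u = 2 * x**2 + 3 * y**2
    return math.sin(u) / u

# Two hypothetical sequences approaching the origin: (1/k, 1/k) and (1/k, -2/k).
for k in (1, 10, 100, 1000):
    t = 1.0 / k
    print(k, f(t, t), f(t, -2 * t))
```

The printed values approach 1 along both sequences, as the composite limit law predicts.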


Let us look at some examples where the rules we have studied cannot be 
applied. 




Example 2.15 


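Determine whether the limit $\displaystyle\lim_{(x,y) \to (0,0)} \frac{x^2 - 2y^2}{x^2 + y^2}$ exists.

Solution
Let
$$f(x, y) = \frac{x^2 - 2y^2}{x^2 + y^2}.$$

When $(x, y) \to (0, 0)$, $g(x, y) = x^2 + y^2 \to 0$. Hence, we cannot apply the
limit law for quotients of functions.

Consider the sequences of points $\{\mathbf{u}_k\}$ and $\{\mathbf{v}_k\}$ in $\mathbb{R}^2 \setminus \{(0, 0)\}$ given by

$$\mathbf{u}_k = \left(\frac{1}{k}, 0\right), \qquad \mathbf{v}_k = \left(0, \frac{1}{k}\right).$$

Notice that both the sequences $\{\mathbf{u}_k\}$ and $\{\mathbf{v}_k\}$ converge to $(0, 0)$. If
$\lim_{(x,y) \to (0,0)} f(x, y) = a$, then both the sequences $\{f(\mathbf{u}_k)\}$ and $\{f(\mathbf{v}_k)\}$
should converge to $a$. Since

$$f(\mathbf{u}_k) = 1, \qquad f(\mathbf{v}_k) = -2 \qquad \text{for all } k \in \mathbb{Z}^+,$$

the sequence $\{f(\mathbf{u}_k)\}$ converges to $1$, while the sequence $\{f(\mathbf{v}_k)\}$
converges to $-2$. These imply that $a = 1$ and $a = -2$, which is a
contradiction. Hence, the limit $\displaystyle\lim_{(x,y) \to (0,0)} \frac{x^2 - 2y^2}{x^2 + y^2}$ does not exist.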
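
The two-sequence argument of Example 2.15 is easy to reproduce numerically. The sketch below evaluates $f(x,y) = (x^2 - 2y^2)/(x^2 + y^2)$ along the sequences $(1/k, 0)$ and $(0, 1/k)$; the variable names are chosen here for illustration.

```python
def f(x, y):
    """f(x, y) = (x^2 - 2y^2) / (x^2 + y^2), defined for (x, y) != (0, 0)."""
    return (x**2 - 2 * y**2) / (x**2 + y**2)

for k in (1, 10, 100, 1000):
    u_k = (1.0 / k, 0.0)     # approaches (0, 0) along the x-axis
    v_k = (0.0, 1.0 / k)     # approaches (0, 0) along the y-axis
    print(k, f(*u_k), f(*v_k))
```

Every row shows the value 1 along the $x$-axis and $-2$ along the $y$-axis, so no single number can serve as the limit.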


Figure 2.3: The function $f(x, y) = \dfrac{x^2 - 2y^2}{x^2 + y^2}$ in Example 2.15.

Example 2.16

Determine whether the limit $\displaystyle\lim_{(x,y) \to (0,0)} \frac{xy}{x^2 + 2y^2}$ exists.

Solution
Let

$$f(x, y) = \frac{xy}{x^2 + 2y^2}.$$

Consider the sequences of points $\{\mathbf{u}_k\}$ and $\{\mathbf{v}_k\}$ in $\mathbb{R}^2 \setminus \{(0, 0)\}$ given by

$$\mathbf{u}_k = \left(\frac{1}{k}, 0\right), \qquad \mathbf{v}_k = \left(\frac{1}{k}, \frac{1}{k}\right).$$

Notice that both the sequences $\{\mathbf{u}_k\}$ and $\{\mathbf{v}_k\}$ converge to $(0, 0)$. If
$\lim_{(x,y) \to (0,0)} f(x, y) = a$, then both the sequences $\{f(\mathbf{u}_k)\}$ and $\{f(\mathbf{v}_k)\}$
should converge to $a$. Since

$$f(\mathbf{u}_k) = 0, \qquad f(\mathbf{v}_k) = \frac{1}{3} \qquad \text{for all } k \in \mathbb{Z}^+,$$

the sequence $\{f(\mathbf{u}_k)\}$ converges to $0$, while the sequence $\{f(\mathbf{v}_k)\}$
converges to $1/3$. These imply that $a = 0$ and $a = 1/3$, which is a
contradiction. Hence, the limit $\displaystyle\lim_{(x,y) \to (0,0)} \frac{xy}{x^2 + 2y^2}$ does not exist.


Figure 2.4: The function $f(x, y) = \dfrac{xy}{x^2 + 2y^2}$ in Example 2.16.

Example 2.17

Determine whether the limit $\displaystyle\lim_{(x,y) \to (0,0)} \frac{xy^2}{x^2 + 2y^4}$ exists.

Solution
Let

$$f(x, y) = \frac{xy^2}{x^2 + 2y^4}.$$

Consider the sequences of points $\{\mathbf{u}_k\}$ and $\{\mathbf{v}_k\}$ in $\mathbb{R}^2 \setminus \{(0, 0)\}$ given by

$$\mathbf{u}_k = \left(\frac{1}{k}, 0\right), \qquad \mathbf{v}_k = \left(\frac{1}{k^2}, \frac{1}{k}\right).$$

Notice that both the sequences $\{\mathbf{u}_k\}$ and $\{\mathbf{v}_k\}$ converge to $(0, 0)$. If
$\lim_{(x,y) \to (0,0)} f(x, y) = a$, then both the sequences $\{f(\mathbf{u}_k)\}$ and $\{f(\mathbf{v}_k)\}$
should converge to $a$. Since

$$f(\mathbf{u}_k) = 0, \qquad f(\mathbf{v}_k) = \frac{1}{3} \qquad \text{for all } k \in \mathbb{Z}^+,$$

the sequence $\{f(\mathbf{u}_k)\}$ converges to $0$, while the sequence $\{f(\mathbf{v}_k)\}$
converges to $1/3$. These imply that $a = 0$ and $a = 1/3$, which is a
contradiction. Hence, the limit $\displaystyle\lim_{(x,y) \to (0,0)} \frac{xy^2}{x^2 + 2y^4}$ does not exist.

Figure 2.5: The function $f(x, y) = \dfrac{xy^2}{x^2 + 2y^4}$ in Example 2.17.

Example 2.18

Determine whether the limit $\displaystyle\lim_{(x,y) \to (0,0)} \frac{xy^2}{x^2 + 2y^2}$ exists.

Solution
Let

$$f(x, y) = \frac{xy^2}{x^2 + 2y^2}.$$

If $\{(x_k, y_k)\}$ is a sequence of points in $\mathbb{R}^2 \setminus \{(0, 0)\}$ that converges to $(0, 0)$,
then, since $y_k^2 \leq x_k^2 + 2y_k^2$,

$$|f(x_k, y_k)| = |x_k| \, \frac{y_k^2}{x_k^2 + 2y_k^2} \leq |x_k|.$$

The sequence $\{x_k\}$ converges to $0$. By the squeeze theorem, the sequence
$\{f(x_k, y_k)\}$ also converges to $0$. This proves that

$$\lim_{(x,y) \to (0,0)} \frac{xy^2}{x^2 + 2y^2} = 0.$$

Similar to the single variable case, there is an equivalent definition of limits in
terms of $\varepsilon$ and $\delta$.


Figure 2.6: The function $f(x, y) = \dfrac{xy^2}{x^2 + 2y^2}$ in Example 2.18.


Theorem 2.14 Equivalent Definitions for Limits 


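Let $\mathcal{D}$ be a subset of $\mathbb{R}^n$, and let $\mathbf{x}_0$ be a limit point of $\mathcal{D}$. Given a function
$\mathbf{F} : \mathcal{D} \to \mathbb{R}^m$, the following two definitions for

$$\lim_{\mathbf{x} \to \mathbf{x}_0} \mathbf{F}(\mathbf{x}) = \mathbf{v}$$

are equivalent.

(i) Whenever $\{\mathbf{x}_k\}$ is a sequence of points in $\mathcal{D} \setminus \{\mathbf{x}_0\}$ that converges to
$\mathbf{x}_0$, the sequence $\{\mathbf{F}(\mathbf{x}_k)\}$ converges to $\mathbf{v}$.

(ii) For any $\varepsilon > 0$, there is a $\delta > 0$ such that if the point $\mathbf{x}$ is in $\mathcal{D}$ and
$0 < \|\mathbf{x} - \mathbf{x}_0\| < \delta$, then $\|\mathbf{F}(\mathbf{x}) - \mathbf{v}\| < \varepsilon$.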


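We will prove that if (ii) holds, then (i) holds; and if (ii) does not hold, then
(i) also does not hold.

First assume that (ii) holds. If $\{\mathbf{x}_k\}$ is a sequence in $\mathcal{D} \setminus \{\mathbf{x}_0\}$ that converges
to the point $\mathbf{x}_0$, we need to show that the sequence $\{\mathbf{F}(\mathbf{x}_k)\}$ converges to $\mathbf{v}$.
Given $\varepsilon > 0$, (ii) implies that there is a $\delta > 0$ such that for all $\mathbf{x}$ in
$\mathcal{D} \setminus \{\mathbf{x}_0\}$ with $\|\mathbf{x} - \mathbf{x}_0\| < \delta$, we have $\|\mathbf{F}(\mathbf{x}) - \mathbf{v}\| < \varepsilon$.
Since $\{\mathbf{x}_k\}$ converges to $\mathbf{x}_0$, there is a positive integer $K$ such that for all
$k \geq K$, $\|\mathbf{x}_k - \mathbf{x}_0\| < \delta$. Therefore, for all $k \geq K$, $\|\mathbf{F}(\mathbf{x}_k) - \mathbf{v}\| < \varepsilon$. This
shows that the sequence $\{\mathbf{F}(\mathbf{x}_k)\}$ indeed converges to $\mathbf{v}$.

Now assume that (ii) does not hold. Then there is an $\varepsilon > 0$ such that for any
$\delta > 0$, there is a point $\mathbf{x}$ in $\mathcal{D} \setminus \{\mathbf{x}_0\}$ with $\|\mathbf{x} - \mathbf{x}_0\| < \delta$ but $\|\mathbf{F}(\mathbf{x}) - \mathbf{v}\| \geq \varepsilon$.
For this $\varepsilon > 0$, we construct a sequence $\{\mathbf{x}_k\}$ in $\mathcal{D} \setminus \{\mathbf{x}_0\}$ in the following
way. For each positive integer $k$, there is a point $\mathbf{x}_k$ in $\mathcal{D} \setminus \{\mathbf{x}_0\}$ such that
$\|\mathbf{x}_k - \mathbf{x}_0\| < 1/k$ but $\|\mathbf{F}(\mathbf{x}_k) - \mathbf{v}\| \geq \varepsilon$. Then $\{\mathbf{x}_k\}$ is a sequence in $\mathcal{D} \setminus \{\mathbf{x}_0\}$
that satisfies

$$\|\mathbf{x}_k - \mathbf{x}_0\| < 1/k \quad \text{for all } k \in \mathbb{Z}^+.$$

Hence, it converges to $\mathbf{x}_0$. Since $\|\mathbf{F}(\mathbf{x}_k) - \mathbf{v}\| \geq \varepsilon$ for all $k \in \mathbb{Z}^+$, the
sequence $\{\mathbf{F}(\mathbf{x}_k)\}$ cannot converge to $\mathbf{v}$. This proves that (i) does not hold.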


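We can give an alternative solution to Example 2.18 as follows.

Alternative Solution to Example 2.18
Let

$$f(x, y) = \frac{xy^2}{x^2 + 2y^2}.$$

Given $\varepsilon > 0$, let $\delta = \varepsilon$. If $(x, y)$ is a point in $\mathbb{R}^2 \setminus \{(0, 0)\}$ such that

$$0 < \sqrt{x^2 + y^2} = \|(x, y) - (0, 0)\| < \delta = \varepsilon,$$

then $|x| < \varepsilon$. This implies that

$$|f(x, y) - 0| = |x| \, \frac{y^2}{x^2 + 2y^2} \leq |x| < \varepsilon.$$

Hence,

$$\lim_{(x,y) \to (0,0)} \frac{xy^2}{x^2 + 2y^2} = 0.$$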
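
The choice $\delta = \varepsilon$ in the alternative solution to Example 2.18 can be spot-checked by sampling points with $0 < \|(x, y)\| < \delta$ and confirming that $|f(x, y)| < \varepsilon$ for $f(x,y) = xy^2/(x^2 + 2y^2)$. The following is a minimal sketch; the function name `check_delta`, the sampling scheme and the number of trials are assumptions made here, and random sampling can only support, never replace, the $\varepsilon$-$\delta$ argument.

```python
import math
import random

def f(x, y):
    """f(x, y) = x*y^2 / (x^2 + 2*y^2), defined for (x, y) != (0, 0)."""
    return x * y**2 / (x**2 + 2 * y**2)

def check_delta(eps, trials=10_000, seed=0):
    """Sample points with 0 < ||(x, y)|| < delta = eps and test |f(x, y)| < eps."""
    rng = random.Random(seed)
    delta = eps
    for _ in range(trials):
        r = delta * rng.uniform(1e-6, 0.999999)   # radius strictly inside (0, delta)
        theta = rng.uniform(0.0, 2.0 * math.pi)   # direction of approach
        x, y = r * math.cos(theta), r * math.sin(theta)
        if not abs(f(x, y)) < eps:
            return False
    return True

print(check_delta(0.1), check_delta(1e-3))
```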




Exercises 2.2 


Question 1 


Determine whether the limit exists. If it exists, find the limit. 
Ay? —_ y? 
nh 
(x,y)+(1,2) 22 + y? 


(a) 


Ag? — y? 
lim ———- 
(ey)>(1,2) Ya? + y? 


4 iD) iD 
(c) lim ao 
(x,y)3(1,2) Vx? + y? 


Question 2 


(b) 


Determine whether the limit exists. If it exists, find the limit. 


go ae y° 
1m —— 
(x,y)—>(0,0) 2? + y? 


(a) 


2 3 

(b) im cae 
(wy) (0,0) 22 + y? 

gee 


(c) 


li ———— 
(xy)-»(00) 4a? + y? 


er ty? _ 4 


(d) 


a ———— 
(x,y)+(0,0) 4x? + y? 


Question 3 
Determine whether the limit 
1m — = 
(e,y)+(0,0) 4a? + y4 


exists. If it exists, find the limit. 




Question 4 


Determine whether the limit 


cos(4? + y* — 2) —1 
1m 
(e701)  (@? + y? — 2)? 


exists. If it exists, find the limit. 


Question 5 


Let $\mathbf{x}_0$ be a point in $\mathbb{R}^n$. Find the limit $\displaystyle\lim_{\mathbf{x} \to \mathbf{x}_0} \frac{\mathbf{x}}{\|\mathbf{x}\|}$.
Question 6 


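Let $\mathcal{D}$ be a subset of $\mathbb{R}^n$, and let $f : \mathcal{D} \to \mathbb{R}$ and $\mathbf{G} : \mathcal{D} \to \mathbb{R}^m$ be functions
defined on $\mathcal{D}$. We can define the function $\mathbf{H} : \mathcal{D} \to \mathbb{R}^m$ by

$$\mathbf{H}(\mathbf{x}) = f(\mathbf{x})\mathbf{G}(\mathbf{x})$$

for all $\mathbf{x} \in \mathcal{D}$.
If $\mathbf{x}_0$ is a point in $\mathcal{D}$ and

$$\lim_{\mathbf{x} \to \mathbf{x}_0} f(\mathbf{x}) = a, \qquad \lim_{\mathbf{x} \to \mathbf{x}_0} \mathbf{G}(\mathbf{x}) = \mathbf{v},$$
show that

$$\lim_{\mathbf{x} \to \mathbf{x}_0} \mathbf{H}(\mathbf{x}) = a\mathbf{v}.$$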




2.3 Continuity


The definition of continuity is a direct generalization of the single variable case. 


Definition 2.7 Continuity 


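Let $\mathcal{D}$ be a subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $\mathbf{F} : \mathcal{D} \to \mathbb{R}^m$ be
a function defined on $\mathcal{D}$. We say that the function $\mathbf{F}$ is continuous at $\mathbf{x}_0$
provided that whenever $\{\mathbf{x}_k\}$ is a sequence of points in $\mathcal{D}$ that converges to
$\mathbf{x}_0$, the sequence $\{\mathbf{F}(\mathbf{x}_k)\}$ converges to $\mathbf{F}(\mathbf{x}_0)$.

We say that $\mathbf{F} : \mathcal{D} \to \mathbb{R}^m$ is a continuous function if it is continuous at
every point of its domain $\mathcal{D}$.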


From the definition, we obtain the following immediately. 


Proposition 2.15 Limits and Continuity 


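Let $\mathcal{D}$ be a subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $\mathbf{F} : \mathcal{D} \to \mathbb{R}^m$ be
a function defined on $\mathcal{D}$.

1. If $\mathbf{x}_0$ is an isolated point of $\mathcal{D}$, then $\mathbf{F}$ is continuous at $\mathbf{x}_0$.
2. If $\mathbf{x}_0$ is a limit point of $\mathcal{D}$, then $\mathbf{F}$ is continuous at $\mathbf{x}_0$ if and only if

$$\lim_{\mathbf{x} \to \mathbf{x}_0} \mathbf{F}(\mathbf{x}) = \mathbf{F}(\mathbf{x}_0).$$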


Example 2.19 


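Example 2.10 says that for each $1 \leq i \leq n$, the projection function $\pi_i : \mathbb{R}^n \to \mathbb{R}$,
$\pi_i(\mathbf{x}) = x_i$, is a continuous function.

Example 2.20

Example 2.11 says that the norm function $f : \mathbb{R}^n \to \mathbb{R}$, $f(\mathbf{x}) = \|\mathbf{x}\|$, is a
continuous function.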


From Proposition 2.10, we have the following. 


Proposition 2.16

Let $\mathcal{D}$ be a subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $\mathbf{F} : \mathcal{D} \to \mathbb{R}^m$
be a function defined on $\mathcal{D}$. The function $\mathbf{F} : \mathcal{D} \to \mathbb{R}^m$ is continuous at $\mathbf{x}_0$
if and only if each of the component functions $F_j = (\pi_j \circ \mathbf{F}) : \mathcal{D} \to \mathbb{R}$,
$1 \leq j \leq m$, is continuous at $\mathbf{x}_0$.

Example 2.21

The function $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^2$,

F(x,y,2) = (#, 2),

is a continuous function since each component function is continuous.


Proposition 2.11 gives the following. 


Proposition 2.17 


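Let $\mathbf{F} : \mathcal{D} \to \mathbb{R}^m$ and $\mathbf{G} : \mathcal{D} \to \mathbb{R}^m$ be functions defined on $\mathcal{D} \subset \mathbb{R}^n$, and
let $\mathbf{x}_0$ be a point in $\mathcal{D}$. If $\mathbf{F} : \mathcal{D} \to \mathbb{R}^m$ and $\mathbf{G} : \mathcal{D} \to \mathbb{R}^m$ are continuous at
$\mathbf{x}_0$, then for any real numbers $\alpha$ and $\beta$, the function $(\alpha\mathbf{F} + \beta\mathbf{G}) : \mathcal{D} \to \mathbb{R}^m$
is continuous at $\mathbf{x}_0$.

Proposition 2.12 gives the following.

Proposition 2.18

Let $f : \mathcal{D} \to \mathbb{R}$ and $g : \mathcal{D} \to \mathbb{R}$ be functions defined on $\mathcal{D} \subset \mathbb{R}^n$, and let
$\mathbf{x}_0$ be a point in $\mathcal{D}$. Assume that the functions $f : \mathcal{D} \to \mathbb{R}$ and $g : \mathcal{D} \to \mathbb{R}$
are continuous at $\mathbf{x}_0$.

1. The function $(fg) : \mathcal{D} \to \mathbb{R}$ is continuous at $\mathbf{x}_0$.

2. If $g(\mathbf{x}) \neq 0$ for all $\mathbf{x} \in \mathcal{D}$, then the function $(f/g) : \mathcal{D} \to \mathbb{R}$ is
continuous at $\mathbf{x}_0$.

Example 2.12 gives the following.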


Proposition 2.19 


Polynomials and rational functions are continuous functions. 


Since each component of a linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$ is a polynomial,
we have the following.

Proposition 2.20

A linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$ is a continuous function.

Since a quadratic form $Q : \mathbb{R}^n \to \mathbb{R}$ is a polynomial, we have the following.

Proposition 2.21

A quadratic form $Q : \mathbb{R}^n \to \mathbb{R}$ given by
$$Q(\mathbf{x}) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j$$

is a continuous function.


The following is obvious from the definition of continuity. 


Proposition 2.22 


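Let $\mathcal{D}$ be a subset of $\mathbb{R}^n$, and let $\mathbf{F} : \mathcal{D} \to \mathbb{R}^m$ be a function that is
continuous at the point $\mathbf{x}_0 \in \mathcal{D}$. If $\mathcal{D}_1$ is a subset of $\mathcal{D}$ that contains $\mathbf{x}_0$,
then the function $\mathbf{F} : \mathcal{D}_1 \to \mathbb{R}^m$ is also continuous at $\mathbf{x}_0$.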


Example 2.22 
Let $D$ be the set
\[ D = \{(x, y) \mid x^2 + y^2 < 1\}, \]
and let $f : D \to \mathbb{R}$ be the function defined as
\[ f(x, y) = \frac{xy}{1 - x^2 - y^2}. \]
Since $f_1(x, y) = xy$ and $f_2(x, y) = 1 - x^2 - y^2$ are polynomials, they are continuous. Since $f_2(x, y) \ne 0$ for all $(x, y) \in D$, $f : D \to \mathbb{R}$ is a continuous function.


Figure 2.7: The function $f(x, y) = \dfrac{xy}{1 - x^2 - y^2}$ in Example 2.22.

Proposition 2.13 implies the following. 


Proposition 2.23 


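Let $D$ be a subset of $\mathbb{R}^n$, and let $U$ be a subset of $\mathbb{R}^k$. If $\mathbf{F} : D \to \mathbb{R}^k$ and $\mathbf{G} : U \to \mathbb{R}^m$ are functions such that $\mathbf{F}(D) \subset U$, $\mathbf{F} : D \to \mathbb{R}^k$ is continuous at $\mathbf{x}_0$, and $\mathbf{G} : U \to \mathbb{R}^m$ is continuous at $\mathbf{y}_0 = \mathbf{F}(\mathbf{x}_0)$, then the composite function $\mathbf{H} = (\mathbf{G} \circ \mathbf{F}) : D \to \mathbb{R}^m$ is continuous at $\mathbf{x}_0$.


A direct proof of this proposition using the definition of continuity is actually much simpler.


If $\{\mathbf{x}_k\}$ is a sequence of points in $D$ that converges to $\mathbf{x}_0$, then since $\mathbf{F} : D \to \mathbb{R}^k$ is continuous at $\mathbf{x}_0$, $\{\mathbf{F}(\mathbf{x}_k)\}$ is a sequence of points in $U$ that converges to $\mathbf{y}_0$. Since $\mathbf{G} : U \to \mathbb{R}^m$ is continuous at $\mathbf{y}_0$, $\{\mathbf{G}(\mathbf{F}(\mathbf{x}_k))\}$ is a sequence of points in $\mathbb{R}^m$ that converges to $\mathbf{G}(\mathbf{y}_0) = \mathbf{G}(\mathbf{F}(\mathbf{x}_0))$. In other words, the sequence $\{\mathbf{H}(\mathbf{x}_k)\}$ converges to $\mathbf{H}(\mathbf{x}_0)$. This shows that the function $\mathbf{H} = (\mathbf{G} \circ \mathbf{F}) : D \to \mathbb{R}^m$ is continuous at $\mathbf{x}_0$.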


Figure 2.8: Composition of functions. 


Corollary 2.24 


Let $D$ be a subset of $\mathbb{R}^n$, and let $\mathbf{x}_0$ be a point in $D$. If the function $\mathbf{F} : D \to \mathbb{R}^m$ is continuous at $\mathbf{x}_0 \in D$, then the function $\|\mathbf{F}\| : D \to \mathbb{R}$ is also continuous at $\mathbf{x}_0$.


Figure 2.9: The function f(z, y) = |x? — y?|. 




Example 2.23 


The function $f : \mathbb{R}^2 \to \mathbb{R}$, $f(x, y) = |x^2 - y^2|$, is a continuous function since $f(x, y) = |p(x, y)|$, where $p(x, y) = x^2 - y^2$ is a polynomial function, which is continuous.


Example 2.24 


Consider the function $f : \mathbb{R}^2 \to \mathbb{R}$, $f(x, y) = \sqrt{e^{2xy} + x^2 + y^2}$. Notice that $f(x, y) = \|\mathbf{F}(x, y)\|$, where $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^3$ is the function given by
\[ \mathbf{F}(x, y) = (e^{xy}, x, y). \]
Since $g(x, y) = xy$ is a polynomial function, it is continuous. Being a composition of the continuous function $h(x) = e^x$ with the continuous function $g(x, y) = xy$, $F_1(x, y) = (h \circ g)(x, y) = e^{xy}$ is a continuous function. The functions $F_2(x, y) = x$ and $F_3(x, y) = y$ are continuous functions. Hence, $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^3$ is a continuous function. This implies that $f : \mathbb{R}^2 \to \mathbb{R}$ is also a continuous function.


Figure 2.10: The function $f(x, y) = \sqrt{e^{2xy} + x^2 + y^2}$.



Example 2.25 


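We have shown in volume I that the function $f : \mathbb{R} \to \mathbb{R}$,
\[ f(x) = \begin{cases} \dfrac{\sin x}{x}, & \text{if } x \ne 0, \\ 1, & \text{if } x = 0, \end{cases} \]
is a continuous function. Define the function $h : \mathbb{R}^3 \to \mathbb{R}$ by
\[ h(x, y, z) = \begin{cases} \dfrac{\sin(x^2 + y^2 + z^2)}{x^2 + y^2 + z^2}, & \text{if } (x, y, z) \ne (0, 0, 0), \\ 1, & \text{if } (x, y, z) = (0, 0, 0). \end{cases} \]
Since $h = f \circ g$, where $g : \mathbb{R}^3 \to \mathbb{R}$ is the polynomial function $g(x, y, z) = x^2 + y^2 + z^2$, which is continuous, the function $h : \mathbb{R}^3 \to \mathbb{R}$ is continuous.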


The following gives an equivalent definition of continuity in terms of € and 6. 


Theorem 2.25 Equivalent Definitions of Continuity 


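Let $D$ be a subset of $\mathbb{R}^n$, and let $\mathbf{x}_0$ be a limit point of $D$. Given a function $\mathbf{F} : D \to \mathbb{R}^m$, the following two definitions for the continuity of $\mathbf{F}$ at $\mathbf{x}_0$ are equivalent.


(i) Whenever $\{\mathbf{x}_k\}$ is a sequence of points in $D$ that converges to $\mathbf{x}_0$, the sequence $\{\mathbf{F}(\mathbf{x}_k)\}$ converges to $\mathbf{F}(\mathbf{x}_0)$.


(ii) For any $\varepsilon > 0$, there is a $\delta > 0$ such that if the point $\mathbf{x}$ is in $D$ and $\|\mathbf{x} - \mathbf{x}_0\| < \delta$, then $\|\mathbf{F}(\mathbf{x}) - \mathbf{F}(\mathbf{x}_0)\| < \varepsilon$.


The proof is left as an exercise. Notice that statement (ii) can be reformulated as follows. For any $\varepsilon > 0$, there is a $\delta > 0$ such that if the point $\mathbf{x}$ is in $D$ and $\mathbf{x} \in B(\mathbf{x}_0, \delta)$, then $\mathbf{F}(\mathbf{x}) \in B(\mathbf{F}(\mathbf{x}_0), \varepsilon)$.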
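
To make statement (ii) concrete, here is a minimal Python sketch for the single function $f(x, y) = xy$ at a point $(x_0, y_0)$. The choice $\delta = \min\{1, \varepsilon/(|x_0| + |y_0| + 1)\}$ is our own assumption, not taken from the text; it comes from the estimate $|xy - x_0y_0| \le |x||y - y_0| + |y_0||x - x_0|$. The function names `delta_for` and `check` are hypothetical, and the random sampling only illustrates the statement, it does not prove it.

```python
import math
import random


def f(x, y):
    """The function under study: f(x, y) = xy."""
    return x * y


def delta_for(eps, x0, y0):
    """A delta that works for f at (x0, y0); this particular formula is our own choice."""
    return min(1.0, eps / (abs(x0) + abs(y0) + 1.0))


def check(eps, x0, y0, trials=10000, seed=0):
    """Sample points with ||(x, y) - (x0, y0)|| < delta and test |f(x, y) - f(x0, y0)| < eps."""
    rng = random.Random(seed)
    delta = delta_for(eps, x0, y0)
    for _ in range(trials):
        r = 0.999 * delta * rng.random()      # radius strictly smaller than delta
        t = rng.uniform(0.0, 2.0 * math.pi)   # random direction
        x, y = x0 + r * math.cos(t), y0 + r * math.sin(t)
        if abs(f(x, y) - f(x0, y0)) >= eps:
            return False
    return True


print(check(0.01, 3.0, -2.0))
```

The same pattern (guess a $\delta$ from an inequality, then sample the ball $B(\mathbf{x}_0, \delta)$) can be reused to sanity-check a hand-derived $\delta$ for other functions.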

Now we want to explore another important property of continuity. 


Figure 2.11: The definition of continuity in terms of $\varepsilon$ and $\delta$.


Theorem 2.26 


Let $O$ be an open subset of $\mathbb{R}^n$, and let $\mathbf{F} : O \to \mathbb{R}^m$ be a function defined on $O$. The following are equivalent.


(a) $\mathbf{F} : O \to \mathbb{R}^m$ is continuous.


(b) For every open subset $V$ of $\mathbb{R}^m$, $\mathbf{F}^{-1}(V)$ is an open subset of $\mathbb{R}^n$.


Note that for this theorem to hold, it is important that the domain of the function $\mathbf{F}$ is an open set.


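Assume that (a) holds. Let $V$ be an open subset of $\mathbb{R}^m$, and let
\[ U = \mathbf{F}^{-1}(V) = \{\mathbf{x} \in O \mid \mathbf{F}(\mathbf{x}) \in V\}. \]
We need to show that $U$ is an open subset of $\mathbb{R}^n$. If $\mathbf{x}_0$ is in $U$, then it is in $O$. Since $O$ is open, there exists $r_0 > 0$ such that $B(\mathbf{x}_0, r_0) \subset O$. Since $\mathbf{y}_0 = \mathbf{F}(\mathbf{x}_0)$ is in $V$ and $V$ is open, there exists $\varepsilon > 0$ such that $B(\mathbf{y}_0, \varepsilon) \subset V$. By (a), there exists $\delta > 0$ such that for any $\mathbf{x} \in O$, if $\|\mathbf{x} - \mathbf{x}_0\| < \delta$, then $\|\mathbf{F}(\mathbf{x}) - \mathbf{F}(\mathbf{x}_0)\| < \varepsilon$.


Take $r = \min\{\delta, r_0\}$. Then $r > 0$, $r \le r_0$ and $r \le \delta$. If $\mathbf{x}$ is in $B(\mathbf{x}_0, r)$, then $\mathbf{x} \in O$ and $\|\mathbf{x} - \mathbf{x}_0\| < r \le \delta$. It follows that $\|\mathbf{F}(\mathbf{x}) - \mathbf{F}(\mathbf{x}_0)\| < \varepsilon$. This implies that $\mathbf{F}(\mathbf{x}) \in B(\mathbf{y}_0, \varepsilon) \subset V$. Thus, $\mathbf{x} \in U$. In other words, we have shown that $B(\mathbf{x}_0, r)$ is contained in $U$. This proves that $U$ is open, which is the assertion of (b).


Conversely, assume that (b) holds. Let $\mathbf{x}_0$ be a point in $O$, and let $\mathbf{y}_0 = \mathbf{F}(\mathbf{x}_0)$. Given $\varepsilon > 0$, the ball $V = B(\mathbf{y}_0, \varepsilon)$ is an open subset of $\mathbb{R}^m$. By (b), $U = \mathbf{F}^{-1}(V)$ is open in $\mathbb{R}^n$. By definition, $U$ is a subset of $O$. Since $\mathbf{F}(\mathbf{x}_0)$ is in $V$, $\mathbf{x}_0$ is in $U$. Since $U$ is open and it contains $\mathbf{x}_0$, there is an $r > 0$ such that $B(\mathbf{x}_0, r) \subset U$. Take $\delta = r$. Then if $\mathbf{x}$ is a point in $O$ and $\|\mathbf{x} - \mathbf{x}_0\| < \delta$, $\mathbf{x} \in B(\mathbf{x}_0, r) \subset U$. This implies that $\mathbf{F}(\mathbf{x}) \in V = B(\mathbf{y}_0, \varepsilon)$. Namely, $\|\mathbf{F}(\mathbf{x}) - \mathbf{F}(\mathbf{x}_0)\| < \varepsilon$. This proves that $\mathbf{F} : O \to \mathbb{R}^m$ is continuous at $\mathbf{x}_0$. Since $\mathbf{x}_0$ is an arbitrary point in $O$, $\mathbf{F} : O \to \mathbb{R}^m$ is continuous.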


Using the fact that a set is open if and only if its complement is closed, it is 
natural to expect the following. 


Theorem 2.27 


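Let $A$ be a closed subset of $\mathbb{R}^n$, and let $\mathbf{F} : A \to \mathbb{R}^m$ be a function defined on $A$. The following are equivalent.


(a) $\mathbf{F} : A \to \mathbb{R}^m$ is continuous.


(b) For every closed subset $C$ of $\mathbb{R}^m$, $\mathbf{F}^{-1}(C)$ is a closed subset of $\mathbb{R}^n$.


Assume that (a) holds. Let $C$ be a closed subset of $\mathbb{R}^m$, and let
\[ D = \mathbf{F}^{-1}(C) = \{\mathbf{x} \in A \mid \mathbf{F}(\mathbf{x}) \in C\}. \]
We need to show that $D$ is a closed subset of $\mathbb{R}^n$. If $\{\mathbf{x}_k\}$ is a sequence in $D$ that converges to the point $\mathbf{x}_0$ in $\mathbb{R}^n$, since $D \subset A$ and $A$ is closed, $\mathbf{x}_0$ is in $A$. Since $\mathbf{F}$ is continuous at $\mathbf{x}_0$, the sequence $\{\mathbf{F}(\mathbf{x}_k)\}$ is a sequence in $C$ that converges to the point $\mathbf{F}(\mathbf{x}_0)$ in $\mathbb{R}^m$. Since $C$ is closed, $\mathbf{F}(\mathbf{x}_0)$ is in $C$. Therefore, $\mathbf{x}_0$ is in $D$. This proves that $D$ is closed.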


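Conversely, assume that (a) does not hold. Then $\mathbf{F} : A \to \mathbb{R}^m$ is not continuous at some $\mathbf{x}_0 \in A$. Thus, there exists $\varepsilon > 0$ such that for any $\delta > 0$, there exists a point $\mathbf{x}$ in $A \cap B(\mathbf{x}_0, \delta)$ such that $\|\mathbf{F}(\mathbf{x}) - \mathbf{F}(\mathbf{x}_0)\| \ge \varepsilon$. For $k \in \mathbb{Z}^+$, let $\mathbf{x}_k$ be a point in $A \cap B(\mathbf{x}_0, 1/k)$ such that $\|\mathbf{F}(\mathbf{x}_k) - \mathbf{F}(\mathbf{x}_0)\| \ge \varepsilon$. Since
\[ \|\mathbf{x}_k - \mathbf{x}_0\| < \frac{1}{k} \quad \text{for all } k \in \mathbb{Z}^+, \]
the sequence $\{\mathbf{x}_k\}$ is a sequence in $A$ that converges to $\mathbf{x}_0$. Let
\[ C = \{\mathbf{y} \in \mathbb{R}^m \mid \|\mathbf{y} - \mathbf{F}(\mathbf{x}_0)\| \ge \varepsilon\}. \]
Then $C$ is the complement of the open set $B(\mathbf{F}(\mathbf{x}_0), \varepsilon)$. Hence, $C$ is closed. It contains $\mathbf{F}(\mathbf{x}_k)$ for all $k \in \mathbb{Z}^+$, but it does not contain $\mathbf{F}(\mathbf{x}_0)$. Thus, the set $D = \mathbf{F}^{-1}(C)$ contains the sequence $\{\mathbf{x}_k\}$, but does not contain its limit $\mathbf{x}_0$. This means $D$ is not closed. Therefore, (b) does not hold.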


There is a much easier proof of Theorem 2.27 if $A = \mathbb{R}^n$, using Theorem 2.26 and the fact that a set is closed if and only if its complement is open.
Theorem 2.26 and Theorem 2.27 provide useful tools to justify that a set is open or closed in $\mathbb{R}^n$, using our known library of continuous functions.
Example 2.26 
Let $A$ be the subset of $\mathbb{R}^2$ given by
\[ A = \{(x, y) \mid x^2 + y^2 < 20,\ y > x^2\}. \]
Show that $A$ is open.


Solution


Let $O = \{(x, y) \mid x^2 + y^2 < 20\}$. This is a ball of radius $\sqrt{20}$ centered at the origin. Hence, $O$ is open. Define the function $f : O \to \mathbb{R}$ by $f(x, y) = y - x^2$. Since $f$ is a polynomial, it is continuous. Notice that $y > x^2$ if and only if $f(x, y) > 0$, if and only if $f(x, y) \in (0, \infty)$. This shows that $A = f^{-1}((0, \infty))$. Since $(0, \infty)$ is open in $\mathbb{R}$, Theorem 2.26 implies that $A$ is an open set.


Figure 2.12: The set $A$ in Example 2.26.


Example 2.27 


Let $C$ be the subset of $\mathbb{R}^3$ given by
\[ C = \{(x, y, z) \mid x \ge 0,\ y \ge 0,\ y^2 + z^2 \le 20\}. \]
Show that $C$ is closed.


Solution


Let $\pi_x : \mathbb{R}^3 \to \mathbb{R}$ and $\pi_y : \mathbb{R}^3 \to \mathbb{R}$ be the projection functions $\pi_x(x, y, z) = x$ and $\pi_y(x, y, z) = y$, and consider the function $g : \mathbb{R}^3 \to \mathbb{R}$ defined as
\[ g(x, y, z) = 20 - (y^2 + z^2). \]
Notice that $y^2 + z^2 \le 20$ if and only if $g(x, y, z) \ge 0$, if and only if $g(x, y, z) \in I = [0, \infty)$. The projection functions $\pi_x$ and $\pi_y$ are continuous. Since $g$ is a polynomial, it is also continuous. The set $I = [0, \infty)$ is closed in $\mathbb{R}$. Therefore, the sets $\pi_x^{-1}(I)$, $\pi_y^{-1}(I)$ and $g^{-1}(I)$ are closed in $\mathbb{R}^3$. Since
\[ C = \pi_x^{-1}(I) \cap \pi_y^{-1}(I) \cap g^{-1}(I), \]
being an intersection of three closed sets, $C$ is closed in $\mathbb{R}^3$.


Using the same reasoning, we obtain the following.



Theorem 2.28 


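Let $I_1, \ldots, I_n$ be intervals in $\mathbb{R}$.


1. If each of $I_1, \ldots, I_n$ is an open interval of the form $(a, b)$, $(a, \infty)$, $(-\infty, a)$ or $\mathbb{R}$, then $I_1 \times \cdots \times I_n$ is an open subset of $\mathbb{R}^n$.


2. If each of $I_1, \ldots, I_n$ is a closed interval of the form $[a, b]$, $[a, \infty)$, $(-\infty, a]$ or $\mathbb{R}$, then $I_1 \times \cdots \times I_n$ is a closed subset of $\mathbb{R}^n$.


Sketch of Proof
Use the fact that
\[ I_1 \times \cdots \times I_n = \bigcap_{i=1}^{n} \pi_i^{-1}(I_i), \]
where $\pi_i : \mathbb{R}^n \to \mathbb{R}$ is the projection function $\pi_i(x_1, \ldots, x_n) = x_i$.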


Example 2.28 


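The set
\[ A = \{(x, y, z) \mid x < 0,\ y > 2,\ -10 < z < -3\} \]
is open in $\mathbb{R}^3$, since
\[ A = (-\infty, 0) \times (2, \infty) \times (-10, -3). \]
The set
\[ C = \{(x, y, z) \mid x \le 0,\ y \ge 2,\ -10 \le z \le -3\} \]
is closed in $\mathbb{R}^3$, since
\[ C = (-\infty, 0] \times [2, \infty) \times [-10, -3]. \]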


We also have the following. 


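Theorem 2.29


Let $a$ and $b$ be real numbers, and assume that $f : \mathbb{R}^n \to \mathbb{R}$ is a continuous function. Define the sets $A$, $B$, $C$, $D$, $E$ and $F$ as follows.


(a) $A = \{\mathbf{x} \in \mathbb{R}^n \mid f(\mathbf{x}) > a\}$


(b) $B = \{\mathbf{x} \in \mathbb{R}^n \mid f(\mathbf{x}) \ge a\}$


(c) $C = \{\mathbf{x} \in \mathbb{R}^n \mid f(\mathbf{x}) < a\}$


(d) $D = \{\mathbf{x} \in \mathbb{R}^n \mid f(\mathbf{x}) \le a\}$


(e) $E = \{\mathbf{x} \in \mathbb{R}^n \mid a < f(\mathbf{x}) < b\}$


(f) $F = \{\mathbf{x} \in \mathbb{R}^n \mid a \le f(\mathbf{x}) \le b\}$


Then $A$, $C$ and $E$ are open sets, while $B$, $D$ and $F$ are closed sets.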


The proof is left as an exercise. 
Example 2.29 


Find the interior, exterior and boundary of each of the following sets. 


(a) $A = \{(x, y) \mid 0 < x^2 + 4y^2 < 4\}$


(b) $B = \{(x, y) \mid 0 < x^2 + 4y^2 \le 4\}$


(c) $C = \{(x, y) \mid x^2 + 4y^2 \le 4\}$


Figure 2.13: The sets $A$, $B$ and $C$ defined in Example 2.29.


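Solution
Let
\[ D = \{(x, y) \mid x^2 + 4y^2 < 4\}, \qquad E = \{(x, y) \mid x^2 + 4y^2 > 4\}, \]
and let $f : \mathbb{R}^2 \to \mathbb{R}$ be the function defined as
\[ f(x, y) = x^2 + 4y^2. \]
Since $f$ is a polynomial, it is continuous. By Theorem 2.29, $A$, $D$ and $E$ are open sets and $C$ is a closed set. Since $A \subset B$ and $D \subset C$, we have
\[ A = \operatorname{int} A \subset \operatorname{int} B \subset B, \qquad D \subset \operatorname{int} C. \]
Since $E = \mathbb{R}^2 \setminus C \subset \mathbb{R}^2 \setminus B \subset \mathbb{R}^2 \setminus A$, we have
\[ E = \operatorname{ext} C \subset \operatorname{ext} B \subset \operatorname{ext} A. \]
Let
\[ F = \{(x, y) \mid x^2 + 4y^2 = 4\}. \]
Then $\mathbb{R}^2$ is a disjoint union of $D$, $E$ and $F$. If $\mathbf{u}_0 = (x_0, y_0) \in F$, then $x_0$ and $y_0$ cannot both be zero. If $x_0 \ne 0$, define the sequences $\{\mathbf{u}_k\}$ and $\{\mathbf{v}_k\}$ by
\[ \mathbf{u}_k = \left(\frac{k}{k+1}\, x_0,\ y_0\right), \qquad \mathbf{v}_k = \left(\frac{k+1}{k}\, x_0,\ y_0\right). \]
If $x_0 = 0$, then $y_0 \ne 0$. Define the sequences $\{\mathbf{u}_k\}$ and $\{\mathbf{v}_k\}$ by
\[ \mathbf{u}_k = \left(x_0,\ \frac{k}{k+1}\, y_0\right), \qquad \mathbf{v}_k = \left(x_0,\ \frac{k+1}{k}\, y_0\right). \]
In either case, $\{\mathbf{u}_k\}$ is a sequence of points in $A$ that converges to $\mathbf{u}_0$, while $\{\mathbf{v}_k\}$ is a sequence of points in $E$ that converges to $\mathbf{u}_0$. This proves that $\mathbf{u}_0$ is a boundary point of $A$, $B$ and $C$. For the point $\mathbf{0}$, since it is not in $A$ and $B$, it is not an interior point of $A$ and $B$, but it is the limit of the sequence $\{(1/k, 0)\}$ that is in both $A$ and $B$. Hence, $\mathbf{0}$ is in the closure of $A$ and $B$, and hence, is a boundary point of $A$ and $B$. We conclude that
\begin{align*}
\operatorname{int} A = \operatorname{int} B &= \{(x, y) \mid 0 < x^2 + 4y^2 < 4\}, \\
\operatorname{int} C &= \{(x, y) \mid x^2 + 4y^2 < 4\}, \\
\operatorname{ext} A = \operatorname{ext} B = \operatorname{ext} C &= \{(x, y) \mid x^2 + 4y^2 > 4\}, \\
\operatorname{bd} A = \operatorname{bd} B &= \{(x, y) \mid x^2 + 4y^2 = 4\} \cup \{\mathbf{0}\}, \\
\operatorname{bd} C &= \{(x, y) \mid x^2 + 4y^2 = 4\}.
\end{align*}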


Remark 2.1 


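Let $f : \mathbb{R}^n \to \mathbb{R}$ be a continuous function and let
\[ C = \{\mathbf{x} \in \mathbb{R}^n \mid a \le f(\mathbf{x}) \le b\}. \]
One is tempted to say that
\[ \operatorname{bd} C = \{\mathbf{x} \in \mathbb{R}^n \mid f(\mathbf{x}) = a \text{ or } f(\mathbf{x}) = b\}. \]
This is not necessarily true. For example, consider the set $C$ in Example 2.29. It can be written as
\[ C = \{(x, y) \mid 0 \le x^2 + 4y^2 \le 4\}. \]
However, the point where $f(x, y) = x^2 + 4y^2 = 0$ is not a boundary point of $C$.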


Now we return to continuous functions. 


Theorem 2.30 Pasting of Continuous Functions 


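Let $A$ and $B$ be closed subsets of $\mathbb{R}^n$, and let $S = A \cup B$. If $\mathbf{F} : S \to \mathbb{R}^m$ is a function such that $\mathbf{F}_A = \mathbf{F}|_A : A \to \mathbb{R}^m$ and $\mathbf{F}_B = \mathbf{F}|_B : B \to \mathbb{R}^m$ are both continuous, then $\mathbf{F} : S \to \mathbb{R}^m$ is continuous.


Since $S$ is a union of two closed sets, it is closed. Applying Theorem 2.27, it suffices to show that if $C$ is a closed subset of $\mathbb{R}^m$, then $\mathbf{F}^{-1}(C)$ is closed in $\mathbb{R}^n$. Notice that
\begin{align*}
\mathbf{F}^{-1}(C) &= \{\mathbf{x} \in S \mid \mathbf{F}(\mathbf{x}) \in C\} \\
&= \{\mathbf{x} \in A \mid \mathbf{F}(\mathbf{x}) \in C\} \cup \{\mathbf{x} \in B \mid \mathbf{F}(\mathbf{x}) \in C\} \\
&= \mathbf{F}_A^{-1}(C) \cup \mathbf{F}_B^{-1}(C).
\end{align*}
Since $\mathbf{F}_A : A \to \mathbb{R}^m$ and $\mathbf{F}_B : B \to \mathbb{R}^m$ are both continuous functions on closed sets, Theorem 2.27 shows that $\mathbf{F}_A^{-1}(C)$ and $\mathbf{F}_B^{-1}(C)$ are closed subsets of $\mathbb{R}^n$. Being a union of two closed subsets, $\mathbf{F}^{-1}(C)$ is closed. This completes the proof.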


Example 2.30 


Let $f : \mathbb{R}^2 \to \mathbb{R}$ be the function defined as
\[ f(x, y) = \begin{cases} x^2 + y^2, & \text{if } x^2 + y^2 < 1, \\ 1, & \text{if } x^2 + y^2 \ge 1. \end{cases} \]
Show that $f$ is a continuous function.


Solution
Let $A = \{(x, y) \mid x^2 + y^2 \le 1\}$ and $B = \{(x, y) \mid x^2 + y^2 \ge 1\}$. Then $A$ and $B$ are closed subsets of $\mathbb{R}^2$ and $\mathbb{R}^2 = A \cup B$. Notice that $f|_A : A \to \mathbb{R}$ is the function $f_A(x, y) = x^2 + y^2$, which is continuous since it is a polynomial. By definition, $f|_B : B \to \mathbb{R}$ is the constant function $f_B(x, y) = 1$, which is also continuous. By Theorem 2.30, the function $f : \mathbb{R}^2 \to \mathbb{R}$ is continuous.


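Given positive integers $n$ and $m$, there is a natural bijective correspondence between $\mathbb{R}^n \times \mathbb{R}^m$ and $\mathbb{R}^{n+m}$ given by $T : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^{n+m}$,
\[ (\mathbf{x}, \mathbf{y}) \mapsto (x_1, \ldots, x_n, y_1, \ldots, y_m), \]
where
\[ \mathbf{x} = (x_1, \ldots, x_n) \quad \text{and} \quad \mathbf{y} = (y_1, \ldots, y_m). \]
Hence, sometimes we will denote a point in $\mathbb{R}^{n+m}$ as $(\mathbf{x}, \mathbf{y})$, where $\mathbf{x} \in \mathbb{R}^n$ and $\mathbf{y} \in \mathbb{R}^m$. By the generalized Pythagoras theorem,
\[ \|(\mathbf{x}, \mathbf{y})\|^2 = \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2. \]
If $A$ is a subset of $\mathbb{R}^n$ and $B$ is a subset of $\mathbb{R}^m$, $A \times B$ can be considered as a subset of $\mathbb{R}^{n+m}$ given by
\[ A \times B = \{(\mathbf{x}, \mathbf{y}) \mid \mathbf{x} \in A,\ \mathbf{y} \in B\}. \]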


The following is more general than Proposition 2.16. 


Proposition 2.31 


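Let $D$ be a subset of $\mathbb{R}^n$, and let $\mathbf{F} : D \to \mathbb{R}^k$ and $\mathbf{G} : D \to \mathbb{R}^l$ be functions defined on $D$. Define the function $\mathbf{H} : D \to \mathbb{R}^{k+l}$ by
\[ \mathbf{H}(\mathbf{x}) = (\mathbf{F}(\mathbf{x}), \mathbf{G}(\mathbf{x})). \]
Then the function $\mathbf{H} : D \to \mathbb{R}^{k+l}$ is continuous if and only if the functions $\mathbf{F} : D \to \mathbb{R}^k$ and $\mathbf{G} : D \to \mathbb{R}^l$ are continuous.


Sketch of Proof
This proposition follows immediately from Proposition 2.16, since
\[ \mathbf{H}(\mathbf{x}) = (F_1(\mathbf{x}), \ldots, F_k(\mathbf{x}), G_1(\mathbf{x}), \ldots, G_l(\mathbf{x})). \]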


For a function defined on a subset of R", we can define its graph in the 


following way. 


Definition 2.8 The Graph of a Function 


Let $\mathbf{F} : D \to \mathbb{R}^m$ be a function defined on $D \subset \mathbb{R}^n$. The graph of $\mathbf{F}$, denoted by $G_{\mathbf{F}}$, is the subset of $\mathbb{R}^{n+m}$ defined as
\[ G_{\mathbf{F}} = \{(\mathbf{x}, \mathbf{y}) \mid \mathbf{x} \in D,\ \mathbf{y} = \mathbf{F}(\mathbf{x})\}. \]



Example 2.31 


Let $D = \{(x, y) \mid x^2 + y^2 \le 1\}$, and let $f : D \to \mathbb{R}$ be the function defined as
\[ f(x, y) = \sqrt{1 - x^2 - y^2}. \]
The graph of $f$ is
\[ G_f = \left\{(x, y, z) \mid x^2 + y^2 \le 1,\ z = \sqrt{1 - x^2 - y^2}\right\}, \]
which is the upper hemisphere.


Figure 2.14: The upper hemisphere is the graph of a function.


Notice that if $D$ is a subset of $\mathbb{R}^n$, then the graph of the function $\mathbf{F} : D \to \mathbb{R}^m$ is the image of the function $\mathbf{H} : D \to \mathbb{R}^{n+m}$ defined as
\[ \mathbf{H}(\mathbf{x}) = (\mathbf{x}, \mathbf{F}(\mathbf{x})). \]
From Proposition 2.31, we obtain the following.


Corollary 2.32


Let $D$ be a subset of $\mathbb{R}^n$, and let $\mathbf{F} : D \to \mathbb{R}^m$ be a function defined on $D$. The image of the function $\mathbf{H} : D \to \mathbb{R}^{n+m}$,
\[ \mathbf{H}(\mathbf{x}) = (\mathbf{x}, \mathbf{F}(\mathbf{x})), \]
is the graph of $\mathbf{F}$. If the function $\mathbf{F} : D \to \mathbb{R}^m$ is continuous, then the function $\mathbf{H} : D \to \mathbb{R}^{n+m}$ is continuous.


Now we consider a special class of functions called Lipschitz functions.



Definition 2.9 


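Let $D$ be a subset of $\mathbb{R}^n$. A function $\mathbf{F} : D \to \mathbb{R}^m$ is Lipschitz provided that there exists a positive constant $c$ such that
\[ \|\mathbf{F}(\mathbf{u}) - \mathbf{F}(\mathbf{v})\| \le c\|\mathbf{u} - \mathbf{v}\| \quad \text{for all } \mathbf{u}, \mathbf{v} \in D. \]
The constant $c$ is called a Lipschitz constant of the function. If $c < 1$, then $\mathbf{F} : D \to \mathbb{R}^m$ is called a contraction.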


The following is easy to establish. 


Proposition 2.33 


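Let $D$ be a subset of $\mathbb{R}^n$, and let $\mathbf{F} : D \to \mathbb{R}^m$ be a Lipschitz function. Then $\mathbf{F} : D \to \mathbb{R}^m$ is continuous.


Example 2.32


A linear transformation of the form $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$, $\mathbf{T}(\mathbf{x}) = a\mathbf{x}$, is a Lipschitz function with Lipschitz constant $|a|$.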


In fact, we have the following. 


Theorem 2.34 


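A linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$ is a Lipschitz function.


Let $A$ be the $m \times n$ matrix such that $\mathbf{T}(\mathbf{x}) = A\mathbf{x}$. When $\mathbf{x}$ is in $\mathbb{R}^n$,
\[ \|\mathbf{T}(\mathbf{x})\|^2 = (A\mathbf{x})^T(A\mathbf{x}) = \mathbf{x}^T(A^TA)\mathbf{x}. \]
The matrix $B = A^TA$ is a positive semi-definite $n \times n$ symmetric matrix. By Theorem 2.7,
\[ \mathbf{x}^T(A^TA)\mathbf{x} \le \lambda_{\max}\|\mathbf{x}\|^2, \]
where $\lambda_{\max}$ is the largest eigenvalue of $A^TA$. Therefore, for any $\mathbf{x} \in \mathbb{R}^n$,
\[ \|\mathbf{T}(\mathbf{x})\| \le \sqrt{\lambda_{\max}}\,\|\mathbf{x}\|. \]
It follows that for any $\mathbf{u}$ and $\mathbf{v}$ in $\mathbb{R}^n$,
\[ \|\mathbf{T}(\mathbf{u}) - \mathbf{T}(\mathbf{v})\| = \|\mathbf{T}(\mathbf{u} - \mathbf{v})\| \le \sqrt{\lambda_{\max}}\,\|\mathbf{u} - \mathbf{v}\|. \]
Hence, $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$ is a Lipschitz mapping with Lipschitz constant $\sqrt{\lambda_{\max}}$.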


Example 2.33 


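Let $\mathbf{T} : \mathbb{R}^2 \to \mathbb{R}^2$ be the mapping defined as
\[ \mathbf{T}(x, y) = (x - 3y,\ 7x + 4y). \]
Find the smallest constant $c$ such that
\[ \|\mathbf{T}(\mathbf{u}) - \mathbf{T}(\mathbf{v})\| \le c\|\mathbf{u} - \mathbf{v}\| \]
for all $\mathbf{u}$ and $\mathbf{v}$ in $\mathbb{R}^2$.

Solution


Notice that $\mathbf{T}(\mathbf{u}) = A\mathbf{u}$, where $A$ is the $2 \times 2$ matrix $A = \begin{bmatrix} 1 & -3 \\ 7 & 4 \end{bmatrix}$. Therefore,
\[ \|\mathbf{T}(\mathbf{u})\|^2 = \mathbf{u}^TA^TA\mathbf{u} = \mathbf{u}^TC\mathbf{u}, \]
where
\[ C = A^TA = \begin{bmatrix} 1 & 7 \\ -3 & 4 \end{bmatrix}\begin{bmatrix} 1 & -3 \\ 7 & 4 \end{bmatrix} = \begin{bmatrix} 50 & 25 \\ 25 & 25 \end{bmatrix} = 25\begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}. \]
For the matrix $G = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}$, the eigenvalues are the solutions of
\[ \lambda^2 - 3\lambda + 1 = 0, \]
which are
\[ \lambda_1 = \frac{3 + \sqrt{5}}{2} \quad \text{and} \quad \lambda_2 = \frac{3 - \sqrt{5}}{2}. \]
Hence,
\[ \|\mathbf{T}(\mathbf{u})\|^2 \le \frac{25(3 + \sqrt{5})}{2}\,\|\mathbf{u}\|^2. \]
The smallest $c$ such that $\|\mathbf{T}(\mathbf{u}) - \mathbf{T}(\mathbf{v})\| \le c\|\mathbf{u} - \mathbf{v}\|$ for all $\mathbf{u}$ and $\mathbf{v}$ in $\mathbb{R}^2$ is
\[ c = \sqrt{\frac{25(3 + \sqrt{5})}{2}} \approx 8.0902. \]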
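
The computation in Example 2.33 can be checked numerically. The following is a minimal sketch in plain Python, assuming a $2 \times 2$ matrix so that the largest eigenvalue of the symmetric matrix $A^TA$ has a closed form. The helper name `largest_singular_value_2x2` is hypothetical and only serves this illustration.

```python
import math


def largest_singular_value_2x2(a, b, c, d):
    """Square root of the largest eigenvalue of A^T A for A = [[a, b], [c, d]]."""
    # Entries of the symmetric matrix A^T A = [[p, q], [q, r]].
    p = a * a + c * c
    q = a * b + c * d
    r = b * b + d * d
    # Largest root of the characteristic polynomial of a symmetric 2x2 matrix.
    lam_max = (p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q * q)
    return math.sqrt(lam_max)


# The matrix of T(x, y) = (x - 3y, 7x + 4y) from Example 2.33.
c = largest_singular_value_2x2(1, -3, 7, 4)
print(round(c, 4))  # prints 8.0902
```

For this matrix $A^TA$ has entries $p = 50$, $q = 25$, $r = 25$, which agrees with the matrix $C$ computed in the solution.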


Remark 2.2 


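If $A$ is an $m \times n$ matrix, the matrix $B = A^TA$ is a positive semi-definite $n \times n$ symmetric matrix. Thus, all its eigenvalues are nonnegative. Let $\lambda_1, \ldots, \lambda_n$ be its eigenvalues with
\[ 0 = \lambda_n = \cdots = \lambda_{r+1} < \lambda_r \le \lambda_{r-1} \le \cdots \le \lambda_1. \]
Then $\lambda_1, \ldots, \lambda_r$ are the nonzero eigenvalues of $A^TA$. The singular values of $A$ are the numbers $\sigma_1, \ldots, \sigma_r$, where
\[ \sigma_i = \sqrt{\lambda_i}, \quad 1 \le i \le r. \]
Theorem 2.34 says that $\sigma_1$ is a Lipschitz constant of the linear transformation $\mathbf{T}(\mathbf{x}) = A\mathbf{x}$.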


At the end of this section, we want to discuss the vector space of $m \times n$ matrices $\mathcal{M}_{m,n}$. There is a natural vector space isomorphism between $\mathcal{M}_{m,n}$ and $\mathbb{R}^{mn}$, by mapping the matrix $A = [a_{ij}]$ to $\mathbf{x} = (x_k)$, where
\[ x_{(i-1)n+j} = a_{ij} \quad \text{for } 1 \le i \le m,\ 1 \le j \le n. \]
In other words, if
\begin{align*}
\mathbf{a}_1 &= (a_{11}, a_{12}, \ldots, a_{1n}), \\
\mathbf{a}_2 &= (a_{21}, a_{22}, \ldots, a_{2n}), \\
&\ \ \vdots \\
\mathbf{a}_m &= (a_{m1}, a_{m2}, \ldots, a_{mn})
\end{align*}
are the row vectors of $A$, then $A$ is mapped to the vector $(\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_m)$ in $\mathbb{R}^{mn}$.
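
The index formula $x_{(i-1)n+j} = a_{ij}$ is simply row-by-row flattening. A short Python sketch, with hypothetical helper names `matrix_to_vector` and `flat_index`, makes the bookkeeping explicit. Note that the text uses 1-based indices while Python lists are 0-based.

```python
def matrix_to_vector(A):
    """Concatenate the rows of A (a list of rows) into one list."""
    return [entry for row in A for entry in row]


def flat_index(i, j, n):
    """1-based position of a_ij in the flattened vector, for a matrix with n columns."""
    return (i - 1) * n + j


A = [[1, 2, 3],
     [4, 5, 6]]          # m = 2 rows, n = 3 columns
x = matrix_to_vector(A)  # the corresponding point of R^6

# Entry a_23 = 6 sits at position (2 - 1) * 3 + 3 = 6 of x.
print(x[flat_index(2, 3, 3) - 1])
```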


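Under this isomorphism, the norm of a matrix $A = [a_{ij}]$ is
\[ \|A\| = \sqrt{\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}^2}, \]
and the distance between two matrices $A = [a_{ij}]$ and $B = [b_{ij}]$ is
\[ d(A, B) = \|A - B\| = \sqrt{\sum_{i=1}^{m}\sum_{j=1}^{n} (a_{ij} - b_{ij})^2}. \]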


The following proposition can be used to give an alternative proof of Theorem 
2.34. 


Proposition 2.35 


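Let $A$ be an $m \times n$ matrix. If $\mathbf{x}$ is in $\mathbb{R}^n$, then
\[ \|A\mathbf{x}\| \le \|A\|\,\|\mathbf{x}\|. \]
Let $\mathbf{a}_1, \ldots, \mathbf{a}_m$ be the row vectors of $A$, and let $\mathbf{w} = A\mathbf{x}$. Then
\[ w_i = \langle \mathbf{a}_i, \mathbf{x} \rangle \quad \text{for } 1 \le i \le m. \]
By Cauchy-Schwarz inequality,
\[ |w_i| \le \|\mathbf{a}_i\|\,\|\mathbf{x}\| \quad \text{for } 1 \le i \le m. \]
Thus,
\begin{align*}
\|\mathbf{w}\| &= \sqrt{w_1^2 + w_2^2 + \cdots + w_m^2} \\
&\le \|\mathbf{x}\|\sqrt{\|\mathbf{a}_1\|^2 + \|\mathbf{a}_2\|^2 + \cdots + \|\mathbf{a}_m\|^2} = \|A\|\,\|\mathbf{x}\|.
\end{align*}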

The difference between the proofs of Theorem 2.34 and Proposition 2.35 is that, in the proof of Theorem 2.34, we find that the smallest possible $c$ such that $\|A\mathbf{x}\| \le c\|\mathbf{x}\|$ for all $\mathbf{x}$ in $\mathbb{R}^n$ is the largest singular value of the matrix $A$. In Proposition 2.35, we find a candidate for $c$, which is the norm of the matrix $A$, but this is usually not the optimal one.
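
To see the gap between the two constants on a concrete case, the sketch below reuses the matrix $A$ of Example 2.33. It compares the matrix norm $\|A\| = \sqrt{75}$ with the largest singular value $5(1 + \sqrt{5})/2$ (the closed form of the constant found in that example), and samples $\|A\mathbf{x}\|$ over random unit vectors. The sampling is only an illustration and the helper names are hypothetical.

```python
import math
import random

A = [[1.0, -3.0],
     [7.0, 4.0]]


def apply(A, x):
    """Matrix-vector product Ax for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def norm(v):
    """Euclidean norm of a vector given as a list."""
    return math.sqrt(sum(t * t for t in v))


matrix_norm = math.sqrt(sum(a * a for row in A for a in row))  # sqrt(1 + 9 + 49 + 16)
sigma_max = 5 * (1 + math.sqrt(5)) / 2                         # constant from Example 2.33

# Largest value of ||Ax|| seen over random unit vectors x.
rng = random.Random(1)
worst = 0.0
for _ in range(20000):
    t = rng.uniform(0.0, 2.0 * math.pi)
    worst = max(worst, norm(apply(A, [math.cos(t), math.sin(t)])))

print(worst <= sigma_max < matrix_norm)
```

Both constants bound $\|A\mathbf{x}\|/\|\mathbf{x}\|$, but only the singular value is approached by the sampled ratios.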

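When $m = n$, we denote the space of $n \times n$ matrices $\mathcal{M}_{n,n}$ simply as $\mathcal{M}_n$. The determinant of the matrix $A = [a_{ij}] \in \mathcal{M}_n$ is given by
\[ \det A = \sum_{\sigma} \operatorname{sgn}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}. \]
Here the summation is over all the $n!$ permutations $\sigma$ of the set $S_n = \{1, 2, \ldots, n\}$, and $\operatorname{sgn}(\sigma)$ is the sign of the permutation $\sigma$, which is equal to $1$ or $-1$, depending on whether $\sigma$ can be written as the product of an even number or an odd number of transpositions. For example, when $n = 1$, $\det[a] = a$. When $n = 2$,
\[ \det\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = a_{11}a_{22} - a_{12}a_{21}. \]
When $n = 3$,
\begin{align*}
\det\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} ={}& a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} \\
& - a_{11}a_{23}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33}.
\end{align*}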
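
The permutation formula translates directly into code. The sketch below is a literal, unoptimized transcription in Python (it costs $n!$ terms, so it is only meant for small $n$). The sign is computed by counting inversions, which has the same parity as the number of transpositions; the function names are our own.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based indices."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def det(A):
    """Determinant of a square matrix (list of rows) via the sum over permutations."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for i in range(n):
            term *= A[i][perm[i]]  # factor a_{i, sigma(i)}
        total += term
    return total


print(det([[1, 2], [3, 4]]))  # 1*4 - 2*3 = -2
```

Each term is a product of entries, so the sum is visibly a polynomial in the variables $a_{ij}$.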


The determinant function $\det : \mathcal{M}_n \to \mathbb{R}$ is a polynomial function on the variables $(a_{ij})$. Hence, it is a continuous function. Recall that a matrix $A \in \mathcal{M}_n$ is invertible if and only if $\det A \ne 0$. Let
\[ GL(n, \mathbb{R}) = \{A \in \mathcal{M}_n \mid \det A \ne 0\} \]
be the subset of $\mathcal{M}_n$ that consists of invertible $n \times n$ matrices. It is a group under matrix multiplication, called the general linear group. By definition,
\[ GL(n, \mathbb{R}) = \det{}^{-1}(\mathbb{R} \setminus \{0\}). \]
Since $\mathbb{R} \setminus \{0\}$ is an open subset of $\mathbb{R}$, $GL(n, \mathbb{R})$ is an open subset of $\mathcal{M}_n$. This gives the following.


Proposition 2.36 


Given that $A$ is an invertible $n \times n$ matrix, there exists $r > 0$ such that if $B$ is an $n \times n$ matrix with $\|B - A\| < r$, then $B$ is also invertible.


Sketch of Proof
This is simply a rephrasing of the statement that if $A$ is a point in the open set $GL(n, \mathbb{R})$, then there is a ball $B(A, r)$ with center at $A$ that is contained in $GL(n, \mathbb{R})$.


Let $A$ be an $n \times n$ matrix. For $1 \le i, j \le n$, the $(i, j)$-minor of $A$, denoted by $M_{i,j}$, is the determinant of the $(n-1) \times (n-1)$ matrix obtained by deleting the $i$-th row and $j$-th column of $A$. Using the same reasoning as above, we find that the function $M_{i,j} : \mathcal{M}_n \to \mathbb{R}$ is a continuous function. The $(i, j)$ cofactor $C_{i,j}$ of $A$ is given by $C_{i,j} = (-1)^{i+j} M_{i,j}$. The cofactor matrix of $A$ is $C_A = [C_{ij}]$. Since each of the components is continuous, the function $C : \mathcal{M}_n \to \mathcal{M}_n$ taking $A$ to $C_A$ is a continuous function.
If $A$ is invertible,
\[ A^{-1} = \frac{1}{\det A}\, C_A^T. \]
Since both $C : \mathcal{M}_n \to \mathcal{M}_n$ and $\det : \mathcal{M}_n \to \mathbb{R}$ are continuous functions, and $\det : GL(n, \mathbb{R}) \to \mathbb{R}$ is a function that is never equal to $0$, we obtain the following.


Theorem 2.37


The map $\mathscr{I} : GL(n, \mathbb{R}) \to GL(n, \mathbb{R})$ that takes $A$ to $A^{-1}$ is continuous.
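
The formula $A^{-1} = \frac{1}{\det A} C_A^T$ can be tried out on small matrices. The following Python sketch uses exact rational arithmetic from the standard library. The determinant here is computed by cofactor expansion along the first row, which is a standard equivalent of the permutation formula and is our own choice for brevity; the helper names are hypothetical.

```python
from fractions import Fraction


def minor_matrix(A, i, j):
    """Matrix obtained from A by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det(A):
    """Determinant by cofactor expansion along the first row."""
    if not A:
        return 1  # determinant of the empty (0 x 0) matrix
    return sum((-1) ** j * A[0][j] * det(minor_matrix(A, 0, j)) for j in range(len(A)))


def inverse(A):
    """Inverse of an invertible square matrix as (1 / det A) times the transposed cofactor matrix."""
    n = len(A)
    d = det(A)
    if d == 0:
        raise ValueError("matrix is not invertible")
    cof = [[(-1) ** (i + j) * det(minor_matrix(A, i, j)) for j in range(n)] for i in range(n)]
    # Entry (i, j) of the inverse is the cofactor C_{j, i} divided by det A.
    return [[Fraction(cof[j][i]) / d for j in range(n)] for i in range(n)]


A = [[1, -3], [7, 4]]  # the matrix of Example 2.33, with det A = 25
print(inverse(A))
```

Every entry of the result is a polynomial in the entries of $A$ divided by $\det A$, which is the structure the continuity argument relies on.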




Exercises 2.3 


Question 1 


Let $\mathbf{x}_0$ be a point in $\mathbb{R}^n$. Define the function $f : \mathbb{R}^n \to \mathbb{R}$ by
\[ f(\mathbf{x}) = \|\mathbf{x} - \mathbf{x}_0\|. \]
Show that $f$ is a continuous function.


Question 2 


Let O = R? \ {(0,0,0)} and define the function F : O > | 


y z 
BU) — | aang? aye ee) 


Show that F is a continuous function. 


Question 3 


Let f : R” — R be the function defined as 


if at least one of the 7; is rational, 


otherwise. 


At which point of IR” is the function f continuous? 


Question 4 


Let f : R” — R be the function defined as 


if at least one of the x; is rational, 


otherwise. 


At which point of IR” is the function f continuous? 




Question 5 


Let f : R®? > R be the function defined by 


sin(x? + 4y? + 2”) 
: 16 (es 2 0,0,0), 
fewaay eae © EOE ONO 


a, it (eta) — (0,0, 0) 


Show that there exists a value a such that f is a continuous function, and 
find this value of a. 


Question 6 


Let $a$ and $b$ be positive numbers, and let $O$ be the subset of $\mathbb{R}^n$ defined as
\[ O = \{\mathbf{x} \in \mathbb{R}^n \mid a < \|\mathbf{x}\| < b\}. \]
Show that $O$ is open.
Question 7 
Let A be the subset of R? given by 
A= {(z,y)| sin(w + y) +2y > 1}. 


Show that A is an open set. 


Question 8 


Let A be the subset of R® given by 


A Adee le yc ooh 


Show that A is a closed set. 




Question 9 


A plane in $\mathbb{R}^3$ is the set of all points $(x, y, z)$ satisfying an equation of the form
\[ ax + by + cz = d, \]
where $(a, b, c) \ne (0, 0, 0)$. Show that a plane is a closed subset of $\mathbb{R}^3$.


Question 10 

Define the sets A, B, C and D as follows. 
(a) A= {(a,u, 2) G AP oe = 36} 
(b) B= {(a,y,z)|x? + 4y? + 92? < 36} 


(c) C= {(a,y,z)|0 < 2? + 4y? + 92? < 36} 


(d) D= {(z,y,z)|0 <2? + 4y* + 92? < 36} 


For each of these sets, find its interior, exterior and boundary. 


Question 11 


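Let $a$ and $b$ be real numbers, and assume that $f : \mathbb{R}^n \to \mathbb{R}$ is a continuous function. Consider the following subsets of $\mathbb{R}^n$.

(a) $A = \{\mathbf{x} \in \mathbb{R}^n \mid f(\mathbf{x}) > a\}$

(b) $B = \{\mathbf{x} \in \mathbb{R}^n \mid f(\mathbf{x}) \ge a\}$

(c) $C = \{\mathbf{x} \in \mathbb{R}^n \mid f(\mathbf{x}) < a\}$

(d) $D = \{\mathbf{x} \in \mathbb{R}^n \mid f(\mathbf{x}) \le a\}$

(e) $E = \{\mathbf{x} \in \mathbb{R}^n \mid a < f(\mathbf{x}) < b\}$

(f) $F = \{\mathbf{x} \in \mathbb{R}^n \mid a \le f(\mathbf{x}) \le b\}$

Show that $A$, $C$ and $E$ are open sets, while $B$, $D$ and $F$ are closed sets.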




Question 12 


Let $f : \mathbb{R}^2 \to \mathbb{R}$ be the function defined as
\[ f(x, y) = \begin{cases} x^2 + y^2, & \text{if } x^2 + y^2 < 4, \\ 8 - x^2 - y^2, & \text{if } x^2 + y^2 \ge 4. \end{cases} \]
Show that $f$ is a continuous function.


Question 13 


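Show that the distance function on $\mathbb{R}^n$, $d : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$,
\[ d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\|, \]
is continuous in the following sense. If $\{\mathbf{u}_k\}$ and $\{\mathbf{v}_k\}$ are sequences in $\mathbb{R}^n$ that converge to $\mathbf{u}$ and $\mathbf{v}$ respectively, then the sequence $\{d(\mathbf{u}_k, \mathbf{v}_k)\}$ converges to $d(\mathbf{u}, \mathbf{v})$.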


Question 14 


Let T : R? — R? be the mapping 


T(z, y) = (2 + y, 8a — y, 6x + Sy). 


Show that T : R? — R?° is a Lipschitz mapping, and find the smallest 


Lipschitz constant for this mapping. 


Question 15 


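Given that $A$ is a subset of $\mathbb{R}^m$ and $B$ is a subset of $\mathbb{R}^n$, let $C = A \times B$. Then $C$ is a subset of $\mathbb{R}^{m+n}$.


(a) If $A$ is open in $\mathbb{R}^m$ and $B$ is open in $\mathbb{R}^n$, show that $A \times B$ is open in $\mathbb{R}^{m+n}$.


(b) If $A$ is closed in $\mathbb{R}^m$ and $B$ is closed in $\mathbb{R}^n$, show that $A \times B$ is closed in $\mathbb{R}^{m+n}$.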




Question 16 


Let $D$ be a subset of $\mathbb{R}^n$, and let $f : D \to \mathbb{R}$ be a continuous function defined on $D$. Let $A = D \times \mathbb{R}$ and define the function $g : A \to \mathbb{R}$ by
\[ g(\mathbf{x}, y) = y - f(\mathbf{x}). \]
Show that $g : A \to \mathbb{R}$ is continuous.


Question 17 


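Let $U$ be an open subset of $\mathbb{R}^n$, and let $f : U \to \mathbb{R}$ be a continuous function defined on $U$. Show that the sets
\[ O_1 = \{(\mathbf{x}, y) \mid \mathbf{x} \in U,\ y < f(\mathbf{x})\}, \qquad O_2 = \{(\mathbf{x}, y) \mid \mathbf{x} \in U,\ y > f(\mathbf{x})\} \]
are open subsets of $\mathbb{R}^{n+1}$.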


Question 18 


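Let $C$ be a closed subset of $\mathbb{R}^n$, and let $f : C \to \mathbb{R}$ be a continuous function defined on $C$. Show that the sets
\[ A_1 = \{(\mathbf{x}, y) \mid \mathbf{x} \in C,\ y \le f(\mathbf{x})\}, \qquad A_2 = \{(\mathbf{x}, y) \mid \mathbf{x} \in C,\ y \ge f(\mathbf{x})\} \]
are closed subsets of $\mathbb{R}^{n+1}$.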




2.4 Uniform Continuity 


In volume I, we have seen that uniform continuity plays an important role in single variable analysis. In this section, we extend this concept to multivariable functions.


Definition 2.10 Uniform Continuity


Let $D$ be a subset of $\mathbb{R}^n$, and let $\mathbf{F} : D \to \mathbb{R}^m$ be a function defined on $D$. We say that the function $\mathbf{F}$ is uniformly continuous provided that for any $\varepsilon > 0$, there exists $\delta > 0$ such that if $\mathbf{u}$ and $\mathbf{v}$ are points in $D$ and $\|\mathbf{u} - \mathbf{v}\| < \delta$, then
\[ \|\mathbf{F}(\mathbf{u}) - \mathbf{F}(\mathbf{v})\| < \varepsilon. \]


The following two propositions are obvious. 


Proposition 2.38 


A uniformly continuous function is continuous. 


Proposition 2.39 


Given that D is a subset of R", and D’ is a subset of D, if the function 
F : 9 > R” is uniformly continuous, then the function F : 9’ + R” is 


also uniformly continuous. 


A special class of uniformly continuous functions is the class of Lipschitz 
functions. 
Theorem 2.40 


Let D be a subset of IR”, and let F : D — R™ be a function defined on D. 
If F :D — R” is Lipschitz, then it is uniformly continuous. 


The proof is straightforward. 


Remark 2.3 


Theorem 2.34 and Theorem 2.40 imply that a linear transformation is 


uniformly continuous. 




There is an equivalent definition for uniform continuity in terms of sequences. 


Theorem 2.41 
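Let $D$ be a subset of $\mathbb{R}^n$, and let $\mathbf{F} : D \to \mathbb{R}^m$ be a function defined on $D$. Then the following are equivalent.


(i) $\mathbf{F} : D \to \mathbb{R}^m$ is uniformly continuous. Namely, given $\varepsilon > 0$, there exists $\delta > 0$ such that if $\mathbf{u}$ and $\mathbf{v}$ are points in $D$ and $\|\mathbf{u} - \mathbf{v}\| < \delta$, then
\[ \|\mathbf{F}(\mathbf{u}) - \mathbf{F}(\mathbf{v})\| < \varepsilon. \]


(ii) If $\{\mathbf{u}_k\}$ and $\{\mathbf{v}_k\}$ are two sequences in $D$ such that
\[ \lim_{k \to \infty} (\mathbf{u}_k - \mathbf{v}_k) = \mathbf{0}, \]
then
\[ \lim_{k \to \infty} (\mathbf{F}(\mathbf{u}_k) - \mathbf{F}(\mathbf{v}_k)) = \mathbf{0}. \]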


Let us give a proof of this theorem here. 


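Assume that (i) holds, and $\{\mathbf{u}_k\}$ and $\{\mathbf{v}_k\}$ are two sequences in $D$ such that
\[ \lim_{k \to \infty} (\mathbf{u}_k - \mathbf{v}_k) = \mathbf{0}. \]
Given $\varepsilon > 0$, (i) implies that there exists $\delta > 0$ such that if $\mathbf{u}$ and $\mathbf{v}$ are points in $D$ and $\|\mathbf{u} - \mathbf{v}\| < \delta$, then
\[ \|\mathbf{F}(\mathbf{u}) - \mathbf{F}(\mathbf{v})\| < \varepsilon. \]
Since $\displaystyle\lim_{k \to \infty} (\mathbf{u}_k - \mathbf{v}_k) = \mathbf{0}$, there is a positive integer $K$ such that for all $k \ge K$, $\|\mathbf{u}_k - \mathbf{v}_k\| < \delta$. It follows that
\[ \|\mathbf{F}(\mathbf{u}_k) - \mathbf{F}(\mathbf{v}_k)\| < \varepsilon \quad \text{for all } k \ge K. \]
This shows that
\[ \lim_{k \to \infty} (\mathbf{F}(\mathbf{u}_k) - \mathbf{F}(\mathbf{v}_k)) = \mathbf{0}, \]
and thus completes the proof of (i) implies (ii).

Conversely, assume that (i) does not hold. This means there exists an $\varepsilon > 0$ such that for all $\delta > 0$, there exist points $\mathbf{u}$ and $\mathbf{v}$ in $D$ such that $\|\mathbf{u} - \mathbf{v}\| < \delta$ and $\|\mathbf{F}(\mathbf{u}) - \mathbf{F}(\mathbf{v})\| \ge \varepsilon$. Thus, for every $k \in \mathbb{Z}^+$, there exist $\mathbf{u}_k$ and $\mathbf{v}_k$ in $D$ such that
\[ \|\mathbf{u}_k - \mathbf{v}_k\| < \frac{1}{k} \tag{2.5} \]
and $\|\mathbf{F}(\mathbf{u}_k) - \mathbf{F}(\mathbf{v}_k)\| \ge \varepsilon$. Notice that $\{\mathbf{u}_k\}$ and $\{\mathbf{v}_k\}$ are sequences in $D$. Eq. (2.5) implies that $\displaystyle\lim_{k \to \infty} (\mathbf{u}_k - \mathbf{v}_k) = \mathbf{0}$. Since $\|\mathbf{F}(\mathbf{u}_k) - \mathbf{F}(\mathbf{v}_k)\| \ge \varepsilon$ for all $k$,
\[ \lim_{k \to \infty} (\mathbf{F}(\mathbf{u}_k) - \mathbf{F}(\mathbf{v}_k)) \ne \mathbf{0}. \]
This shows that if (i) does not hold, then (ii) does not hold.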


From Theorem 2.41, we can deduce the following. 


Proposition 2.42 


Let $D$ be a subset of $\mathbb{R}^n$, and let $\mathbf{F} : D \to \mathbb{R}^m$ be a function defined on $D$. Then $\mathbf{F} : D \to \mathbb{R}^m$ is uniformly continuous if and only if each of the component functions $F_j = (\pi_j \circ \mathbf{F}) : D \to \mathbb{R}$, $1 \le j \le m$, is uniformly continuous.


Let us look at some more examples. 
Example 2.34 


Let $D$ be the open rectangle $D = (0, 5) \times (0, 7)$, and consider the function $f : D \to \mathbb{R}$ defined by
\[ f(x, y) = xy. \]
Determine whether $f : D \to \mathbb{R}$ is uniformly continuous.


Solution
For any two points $\mathbf{u}_1 = (x_1, y_1)$ and $\mathbf{u}_2 = (x_2, y_2)$ in $D$, $0 < x_1, x_2 < 5$ and $0 < y_1, y_2 < 7$. Since
\[ f(\mathbf{u}_1) - f(\mathbf{u}_2) = x_1y_1 - x_2y_2 = x_1(y_1 - y_2) + y_2(x_1 - x_2), \]
we find that
\begin{align*}
|f(\mathbf{u}_1) - f(\mathbf{u}_2)| &\le |x_1||y_1 - y_2| + |y_2||x_1 - x_2| \\
&\le 5\|\mathbf{u}_1 - \mathbf{u}_2\| + 7\|\mathbf{u}_1 - \mathbf{u}_2\| = 12\|\mathbf{u}_1 - \mathbf{u}_2\|.
\end{align*}
This shows that $f : D \to \mathbb{R}$ is a Lipschitz function. Hence, it is uniformly continuous.
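
As a sanity check on the estimate in Example 2.34, the sketch below samples random pairs of points in the rectangle and records the largest observed ratio $|f(\mathbf{u}_1) - f(\mathbf{u}_2)| / \|\mathbf{u}_1 - \mathbf{u}_2\|$. It is only an illustration under our own choice of sample size and seed. It confirms that no sampled pair violates the bound 12, and makes no claim that 12 is the best constant.

```python
import math
import random


def f(x, y):
    """The function of Example 2.34."""
    return x * y


rng = random.Random(0)
worst_ratio = 0.0
for _ in range(50000):
    # Two random points in the rectangle [0, 5] x [0, 7].
    x1, y1 = rng.uniform(0, 5), rng.uniform(0, 7)
    x2, y2 = rng.uniform(0, 5), rng.uniform(0, 7)
    dist = math.hypot(x1 - x2, y1 - y2)
    if dist > 0:
        worst_ratio = max(worst_ratio, abs(f(x1, y1) - f(x2, y2)) / dist)

print(worst_ratio <= 12)
```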


Example 2.35 


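Consider the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by
\[ f(x, y) = xy. \]
Determine whether $f : \mathbb{R}^2 \to \mathbb{R}$ is uniformly continuous.


Solution
For $k \in \mathbb{Z}^+$, let
\[ \mathbf{u}_k = \left(k + \frac{1}{k},\ k\right), \qquad \mathbf{v}_k = (k, k). \]
Then $\{\mathbf{u}_k\}$ and $\{\mathbf{v}_k\}$ are sequences of points in $\mathbb{R}^2$ and
\[ \lim_{k \to \infty} (\mathbf{u}_k - \mathbf{v}_k) = \lim_{k \to \infty} \left(\frac{1}{k},\ 0\right) = (0, 0). \]
However,
\[ f(\mathbf{u}_k) - f(\mathbf{v}_k) = k\left(k + \frac{1}{k}\right) - k^2 = 1. \]
Thus,
\[ \lim_{k \to \infty} (f(\mathbf{u}_k) - f(\mathbf{v}_k)) = 1 \ne 0. \]
Therefore, the function $f : \mathbb{R}^2 \to \mathbb{R}$ is not uniformly continuous.


Examples 2.34 and 2.35 show that whether a function is uniformly continuous depends on the domain of the function.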
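
The sequence criterion of Theorem 2.41 applied to $f(x, y) = xy$ on $\mathbb{R}^2$ can be checked with exact arithmetic. The sketch below uses the sequences $\mathbf{u}_k = (k + 1/k, k)$ and $\mathbf{v}_k = (k, k)$; the helper name `gap_and_difference` is hypothetical.

```python
from fractions import Fraction


def f(x, y):
    """The function f(x, y) = xy."""
    return x * y


def gap_and_difference(k):
    """Return (||u_k - v_k||, f(u_k) - f(v_k)) for u_k = (k + 1/k, k) and v_k = (k, k)."""
    u = (k + Fraction(1, k), Fraction(k))
    v = (Fraction(k), Fraction(k))
    gap = abs(u[0] - v[0])  # the second coordinates agree, so this is the distance
    return gap, f(*u) - f(*v)


for k in (1, 10, 100, 1000):
    print(k, *gap_and_difference(k))
```

The distance between the points shrinks like $1/k$, while the difference of the function values stays equal to 1.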




Exercises 2.4 


Question 1 


Let F : R® > R?’ be the function defined as 
F(z, y, z) = (8a —2z2+7,e+y+2-4). 


Show that F : R? — R? is uniformly continuous. 


Question 2 


Let D = (0,1) x (0,2). Consider the function f : © — R defined as 


f(x,y) = 2? + 3y. 


Determine whether f is uniformly continuous. 


Question 3 


Let D = (1,00) x (1, 00). Consider the function f : © — R defined as 


f(z,y) = Vx y. 


Determine whether f is uniformly continuous. 


Question 4 
Let D = (0,1) x (0,2). Consider the function f : © — R defined as 


1 
very 


Determine whether f is uniformly continuous. 


f(x,y) = 




2.5 Contraction Mapping Theorem 


Among the Lipschitz functions, there is a subset called contractions. 


Definition 2.11 Contractions 


Let $D$ be a subset of $\mathbb{R}^n$. A function $\mathbf{F} : D \to \mathbb{R}^m$ is called a contraction if there exists a constant $0 \le c < 1$ such that
\[ \|\mathbf{F}(\mathbf{u}) - \mathbf{F}(\mathbf{v})\| \le c\|\mathbf{u} - \mathbf{v}\| \quad \text{for all } \mathbf{u}, \mathbf{v} \in D. \]
In other words, a contraction is a Lipschitz function which has a Lipschitz constant that is less than 1.


Example 2.36 


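Let $\mathbf{b}$ be a point in $\mathbb{R}^n$, and let $F : \mathbb{R}^n \to \mathbb{R}^n$ be the function defined as
$$F(\mathbf{x}) = c\mathbf{x} + \mathbf{b}.$$

Since $\|F(\mathbf{u}) - F(\mathbf{v})\| = |c|\,\|\mathbf{u} - \mathbf{v}\|$, the mapping $F$ is a contraction if and only if $|c| < 1$.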


The contraction mapping theorem is an important result in analysis. Extended 
to metric spaces, it is an important tool to prove the existence and uniqueness of 
solutions of ordinary differential equations. 


Theorem 2.43 Contraction Mapping Theorem 


Let $D$ be a closed subset of $\mathbb{R}^n$, and let $F : D \to D$ be a contraction. Then $F$ has a unique fixed point. Namely, there is a unique $\mathbf{u}$ in $D$ such that
$$F(\mathbf{u}) = \mathbf{u}.$$


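By definition, there is a constant $c \in [0, 1)$ such that

$$\|F(\mathbf{u}) - F(\mathbf{v})\| \le c\|\mathbf{u} - \mathbf{v}\| \quad \text{for all } \mathbf{u}, \mathbf{v} \in D.$$

We start with any point $\mathbf{x}_0$ in $D$ and construct the sequence $\{\mathbf{x}_k\}$ inductively by
$$\mathbf{x}_{k+1} = F(\mathbf{x}_k) \quad \text{for all } k \ge 0.$$

Notice that for all $k \in \mathbb{Z}^+$,
$$\|\mathbf{x}_{k+1} - \mathbf{x}_k\| = \|F(\mathbf{x}_k) - F(\mathbf{x}_{k-1})\| \le c\|\mathbf{x}_k - \mathbf{x}_{k-1}\|.$$
By iterating, we find that
$$\|\mathbf{x}_{k+1} - \mathbf{x}_k\| \le c^k\|\mathbf{x}_1 - \mathbf{x}_0\|.$$
Therefore, if $l > k \ge 0$, the triangle inequality implies that

$$\|\mathbf{x}_l - \mathbf{x}_k\| \le \|\mathbf{x}_l - \mathbf{x}_{l-1}\| + \cdots + \|\mathbf{x}_{k+1} - \mathbf{x}_k\| \le \left(c^{l-1} + \cdots + c^k\right)\|\mathbf{x}_1 - \mathbf{x}_0\|.$$
Since $c \in [0, 1)$,
$$c^{l-1} + \cdots + c^k = c^k\left(1 + c + \cdots + c^{l-k-1}\right) \le \frac{c^k}{1 - c}.$$

Therefore, for all $l > k \ge 0$,

$$\|\mathbf{x}_l - \mathbf{x}_k\| \le \frac{c^k}{1 - c}\|\mathbf{x}_1 - \mathbf{x}_0\|.$$

Given $\varepsilon > 0$, there exists a positive integer $K$ such that for all $k \ge K$,

$$\frac{c^k}{1 - c}\|\mathbf{x}_1 - \mathbf{x}_0\| < \varepsilon.$$

This implies that for all $l > k \ge K$,

$$\|\mathbf{x}_l - \mathbf{x}_k\| < \varepsilon.$$

In other words, we have shown that $\{\mathbf{x}_k\}$ is a Cauchy sequence. Therefore, it converges to a point $\mathbf{u}$ in $\mathbb{R}^n$. Since $D$ is closed, $\mathbf{u}$ is in $D$.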

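Since $F$ is continuous, the sequence $\{F(\mathbf{x}_k)\}$ converges to $F(\mathbf{u})$. But $F(\mathbf{x}_k) = \mathbf{x}_{k+1}$. Being a subsequence of $\{\mathbf{x}_k\}$, the sequence $\{\mathbf{x}_{k+1}\}$ converges to $\mathbf{u}$ as well. This shows that

$$F(\mathbf{u}) = \mathbf{u},$$

which says that $\mathbf{u}$ is a fixed point of $F$. Now if $\mathbf{v}$ is another point in $D$ such that $F(\mathbf{v}) = \mathbf{v}$, then

$$\|\mathbf{u} - \mathbf{v}\| = \|F(\mathbf{u}) - F(\mathbf{v})\| \le c\|\mathbf{u} - \mathbf{v}\|.$$

Since $c \in [0, 1)$, this can only be true if $\|\mathbf{u} - \mathbf{v}\| = 0$, which implies that $\mathbf{v} = \mathbf{u}$. Hence, the fixed point of $F$ is unique.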
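The proof is constructive: iterating $F$ from any starting point produces a sequence that converges to the fixed point. The following is a minimal numerical sketch of that iteration. The map `F` below is our own illustrative choice (a contraction on $\mathbb{R}^2$ with constant $1/2$, since $\sin$ and $\cos$ are Lipschitz with constant 1), and the names `fixed_point`, `tol` and `max_iter` are assumptions, not notation from the text.

```python
import math


def fixed_point(F, x0, tol=1e-12, max_iter=10_000):
    """Iterate x_{k+1} = F(x_k) until successive iterates are within tol."""
    x = x0
    for _ in range(max_iter):
        x_next = F(x)
        if math.dist(x_next, x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("iteration did not converge")


def F(p):
    """Illustrative contraction on R^2 with Lipschitz constant 1/2."""
    x, y = p
    return (0.5 * math.cos(y), 0.5 * math.sin(x) + 0.25)


u = fixed_point(F, (0.0, 0.0))
print(u, F(u))  # the two printed points agree up to rounding
```

Starting from a different point, for example `(5.0, -3.0)`, should lead to the same limit, which reflects the uniqueness part of the theorem.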


As an application of the contraction mapping theorem, we prove the following. 


Theorem 2.44 


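Let $r$ be a positive number and let $G : B(\mathbf{0}, r) \to \mathbb{R}^n$ be a mapping such that $G(\mathbf{0}) = \mathbf{0}$, and

$$\|G(\mathbf{u}) - G(\mathbf{v})\| \le \frac{1}{2}\|\mathbf{u} - \mathbf{v}\| \quad \text{for all } \mathbf{u}, \mathbf{v} \in B(\mathbf{0}, r).$$

If $F : B(\mathbf{0}, r) \to \mathbb{R}^n$ is the function defined as

$$F(\mathbf{x}) = \mathbf{x} + G(\mathbf{x}),$$

then $F$ is a one-to-one continuous mapping whose image contains the open ball $B(\mathbf{0}, r/2)$.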




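By definition, $G$ is a contraction. Hence, it is continuous. Therefore, $F : B(\mathbf{0}, r) \to \mathbb{R}^n$ is also continuous. If $F(\mathbf{u}) = F(\mathbf{v})$, then

$$\mathbf{u} - \mathbf{v} = G(\mathbf{v}) - G(\mathbf{u}).$$

Therefore,
$$\|\mathbf{u} - \mathbf{v}\| = \|G(\mathbf{v}) - G(\mathbf{u})\| \le \frac{1}{2}\|\mathbf{u} - \mathbf{v}\|.$$

This implies that $\|\mathbf{u} - \mathbf{v}\| = 0$, and thus, $\mathbf{u} = \mathbf{v}$. Hence, $F$ is one-to-one.
Given $\mathbf{y} \in B(\mathbf{0}, r/2)$, let $r_1 = 2\|\mathbf{y}\|$. Then $r_1 < r$. Consider the map $H : CB(\mathbf{0}, r_1) \to \mathbb{R}^n$ defined as

$$H(\mathbf{x}) = \mathbf{y} - G(\mathbf{x}).$$

For any $\mathbf{u}$ and $\mathbf{v}$ in $CB(\mathbf{0}, r_1)$,

$$\|H(\mathbf{u}) - H(\mathbf{v})\| = \|G(\mathbf{u}) - G(\mathbf{v})\| \le \frac{1}{2}\|\mathbf{u} - \mathbf{v}\|.$$

Therefore, $H$ is also a contraction. Notice that if $\mathbf{x} \in CB(\mathbf{0}, r_1)$,

$$\|H(\mathbf{x})\| \le \|\mathbf{y}\| + \|G(\mathbf{x}) - G(\mathbf{0})\| \le \frac{r_1}{2} + \frac{1}{2}\|\mathbf{x}\| \le \frac{r_1}{2} + \frac{r_1}{2} = r_1.$$

Therefore, $H$ is a contraction that maps the closed set $CB(\mathbf{0}, r_1)$ into itself. By the contraction mapping theorem, there exists $\mathbf{u}$ in $CB(\mathbf{0}, r_1)$ such that $H(\mathbf{u}) = \mathbf{u}$. This gives
$$\mathbf{y} - G(\mathbf{u}) = \mathbf{u},$$
or equivalently,
$$\mathbf{y} = \mathbf{u} + G(\mathbf{u}) = F(\mathbf{u}).$$

In other words, we have shown that there exists $\mathbf{u} \in CB(\mathbf{0}, r_1) \subset B(\mathbf{0}, r)$ such that $F(\mathbf{u}) = \mathbf{y}$. This proves that the image of the map $F : B(\mathbf{0}, r) \to \mathbb{R}^n$ contains the open ball $B(\mathbf{0}, r/2)$.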
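The argument above also suggests a way to solve $F(\mathbf{u}) = \mathbf{y}$ numerically: iterate $H(\mathbf{x}) = \mathbf{y} - G(\mathbf{x})$ starting from the origin. The sketch below is only an illustration under our own assumptions: the map `G` is a hypothetical example on $\mathbb{R}^2$ satisfying $G(\mathbf{0}) = \mathbf{0}$ with Lipschitz constant $1/2$, we take $r = 1$, and the name `solve` and the step count are ours.

```python
import math


def G(p):
    """Hypothetical example map: G(0) = 0 and Lipschitz constant 1/2."""
    x1, x2 = p
    return (0.5 * math.sin(x2), 0.5 * (1.0 - math.cos(x1)))


def solve(y, steps=200):
    """Approximate u with u + G(u) = y by iterating H(x) = y - G(x) from 0."""
    x = (0.0, 0.0)
    for _ in range(steps):
        g = G(x)
        x = (y[0] - g[0], y[1] - g[1])
    return x


y = (0.3, -0.2)            # a point with ||y|| < r/2 = 1/2
u = solve(y)
g = G(u)
print(u, (u[0] + g[0], u[1] + g[1]))  # second point approximates y
```

Every iterate stays in the closed ball of radius $2\|\mathbf{y}\|$, in line with the fact that $H$ maps that ball into itself.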




Exercises 2.5 


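Question 1


Let
$$S^n = \left\{ (x_1, \ldots, x_n, x_{n+1}) \in \mathbb{R}^{n+1} \,\middle|\, x_1^2 + \cdots + x_n^2 + x_{n+1}^2 = 1 \right\}$$

be the $n$-sphere, and let $F : S^n \to S^n$ be a mapping such that

$$\|F(\mathbf{u}) - F(\mathbf{v})\| \le \frac{2}{3}\|\mathbf{u} - \mathbf{v}\| \quad \text{for all } \mathbf{u}, \mathbf{v} \in S^n.$$

Show that there is a unique $\mathbf{w} \in S^n$ such that $F(\mathbf{w}) = \mathbf{w}$.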


Question 2 


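Let $r$ be a positive number, and let $c$ be a positive number less than 1. Assume that $G : B(\mathbf{0}, r) \to \mathbb{R}^n$ is a mapping such that $G(\mathbf{0}) = \mathbf{0}$, and

$$\|G(\mathbf{u}) - G(\mathbf{v})\| \le c\|\mathbf{u} - \mathbf{v}\| \quad \text{for all } \mathbf{u}, \mathbf{v} \in B(\mathbf{0}, r).$$

If $F : B(\mathbf{0}, r) \to \mathbb{R}^n$ is the function defined as
$$F(\mathbf{x}) = \mathbf{x} + G(\mathbf{x}),$$

show that $F$ is a one-to-one continuous mapping whose image contains the open ball $B(\mathbf{0}, ar)$, where $a = 1 - c$.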


Chapter 3. Continuous Functions on Connected Sets and Compact Sets 132 


Chapter 3 


Continuous Functions on Connected Sets and 


Compact Sets 


In volume I, we have seen that the intermediate value theorem and the extreme value theorem play important roles in analysis. In order to extend these two theorems to multivariable functions, we need to consider two topological properties of sets: connectedness and compactness.


3.1 Path-Connectedness and Intermediate Value Theorem 


We want to extend the intermediate value theorem to multivariable functions. For 
this, we need to consider a topological property called connectedness. In this 
section, we will discuss the topological property called path-connectedness first, 
which is a more natural concept. 


Definition 3.1 Path 


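Let $S$ be a subset of $\mathbb{R}^n$, and let $\mathbf{u}$ and $\mathbf{v}$ be two points in $S$. A path in $S$ joining $\mathbf{u}$ to $\mathbf{v}$ is a continuous function $\gamma : [a, b] \to S$ such that $\gamma(a) = \mathbf{u}$ and $\gamma(b) = \mathbf{v}$.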


For any real numbers $a$ and $b$ with $a < b$, the map $u : [0, 1] \to [a, b]$ defined by
$$u(t) = a + t(b - a)$$

is a continuous bijection. Its inverse $u^{-1} : [a, b] \to [0, 1]$ is

$$u^{-1}(t) = \frac{t - a}{b - a},$$

which is also continuous. Hence, in the definition of a path, we can let the domain be any $[a, b]$ with $a < b$.




Figure 3.1: A path in S joining u to v. 


Example 3.1 


Given a set $S$ and a point $\mathbf{x}_0$ in $S$, the constant function $\gamma : [a, b] \to S$, $\gamma(t) = \mathbf{x}_0$, is a path in $S$.


If $\gamma : [a, b] \to S$ is a path in $S \subset \mathbb{R}^n$, and $S'$ is any other subset of $\mathbb{R}^n$ that contains the image of $\gamma$, then $\gamma$ is also a path in $S'$.


Example 3.2 


Let $R$ be the rectangle $R = [-2, 2] \times [-2, 2]$. The function $\gamma : [0, 1] \to \mathbb{R}^2$, $\gamma(t) = (\cos(\pi t), \sin(\pi t))$ is a path in $R$ joining $\mathbf{u} = (1, 0)$ to $\mathbf{v} = (-1, 0)$.


Figure 3.2: The path in Example 3.2. 




Example 3.3 


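Let $S$ be a subset of $\mathbb{R}^n$. If $\gamma : [a, b] \to S$ is a path in $S$ joining $\mathbf{u}$ to $\mathbf{v}$, then $\widetilde{\gamma} : [-b, -a] \to S$, $\widetilde{\gamma}(t) = \gamma(-t)$, is a path in $S$ joining $\mathbf{v}$ to $\mathbf{u}$.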


Now we define path-connectedness. 


Definition 3.2 Path-Connected 


Let $S$ be a subset of $\mathbb{R}^n$. We say that $S$ is path-connected if any two points $\mathbf{u}$ and $\mathbf{v}$ in $S$ can be joined by a path in $S$.


It is easy to characterize a path-connected subset of $\mathbb{R}$. In volume I, we have defined the concept of convex sets. A subset $S$ of $\mathbb{R}$ is a convex set provided that for any $u$ and $v$ in $S$ and any $t \in [0, 1]$, $(1 - t)u + tv$ is also in $S$. This is equivalent to the following: if $u$ and $v$ are points in $S$ with $u < v$, then all the points $w$ satisfying $u < w < v$ are also in $S$. We have shown that a subset $S$ of $\mathbb{R}$ is a convex set if and only if it is an interval.


The following theorem characterizes a path-connected subset of $\mathbb{R}$.


Theorem 3.1 


Let S be a subset of R. Then S is path-connected if and only if S is an 
interval. 


If $S$ is an interval, then for any $u$ and $v$ in $S$, and for any $t \in [0, 1]$, $(1 - t)u + tv$ is in $S$. Hence, the function $\gamma : [0, 1] \to S$, $\gamma(t) = (1 - t)u + tv$ is a path in $S$ that joins $u$ to $v$.


Conversely, assume that $S$ is a path-connected subset of $\mathbb{R}$. To show that $S$ is an interval, we need to show that for any $u$ and $v$ in $S$ with $u < v$, any $w$ that is in the interval $[u, v]$ is also in $S$. Since $S$ is path-connected, there is a path $\gamma : [0, 1] \to S$ such that $\gamma(0) = u$ and $\gamma(1) = v$. Since $\gamma$ is continuous, and $w$ is in between $\gamma(0)$ and $\gamma(1)$, the intermediate value theorem implies that there is a $c \in [0, 1]$ so that $\gamma(c) = w$. Thus, $w$ is in $S$.


To explore path-connected subsets of $\mathbb{R}^n$ with $n \ge 2$, we first extend the concept of convex sets to $\mathbb{R}^n$. Given two points $\mathbf{u}$ and $\mathbf{v}$ in $\mathbb{R}^n$, when $t$ runs through all the points in the interval $[0, 1]$, $(1 - t)\mathbf{u} + t\mathbf{v}$ describes all the points on the line segment between $\mathbf{u}$ and $\mathbf{v}$.


Definition 3.3 Convex Sets 


Let $S$ be a subset of $\mathbb{R}^n$. We say that $S$ is convex if for any two points $\mathbf{u}$ and $\mathbf{v}$ in $S$, the line segment between $\mathbf{u}$ and $\mathbf{v}$ lies entirely in $S$. Equivalently, $S$ is convex provided that for any two points $\mathbf{u}$ and $\mathbf{v}$ in $S$, the point $(1 - t)\mathbf{u} + t\mathbf{v}$ is in $S$ for any $t \in [0, 1]$.


Figure 3.3: A is a convex set, B is not. 


If $\mathbf{u} = (u_1, \ldots, u_n)$ and $\mathbf{v} = (v_1, \ldots, v_n)$ are two points in $\mathbb{R}^n$, the map

$$\gamma(t) = (1 - t)\mathbf{u} + t\mathbf{v} = \left((1 - t)u_1 + tv_1, \ldots, (1 - t)u_n + tv_n\right)$$

is a continuous function, since each of its components is continuous. Thus, we have the following.


Theorem 3.2 


Let $S$ be a subset of $\mathbb{R}^n$. If $S$ is convex, then it is path-connected.


Let us look at some examples of convex sets. 


Example 3.4 


Let $I_1, \ldots, I_n$ be intervals in $\mathbb{R}$. Show that the set $S = I_1 \times \cdots \times I_n$ is path-connected.




Solution 
We claim that $S$ is convex. Then Theorem 3.2 implies that $S$ is path-connected.
Given that $\mathbf{u} = (u_1, \ldots, u_n)$ and $\mathbf{v} = (v_1, \ldots, v_n)$ are two points in $S$, for each $1 \le i \le n$, $u_i$ and $v_i$ are in $I_i$. Since $I_i$ is an interval, for any $t \in [0, 1]$, $(1 - t)u_i + tv_i$ is in $I_i$. Hence,
$$(1 - t)\mathbf{u} + t\mathbf{v} = \left((1 - t)u_1 + tv_1, \ldots, (1 - t)u_n + tv_n\right)$$

is in $S$. This shows that $S$ is convex.


Special cases of sets of the form $S = I_1 \times \cdots \times I_n$ are open and closed rectangles.


Example 3.5 
An open rectangle

$$U = (a_1, b_1) \times \cdots \times (a_n, b_n)$$

and its closure
$$\overline{U} = [a_1, b_1] \times \cdots \times [a_n, b_n]$$

are convex sets. Hence, they are path-connected.


Example 3.6 


Let $\mathbf{x}_0$ be a point in $\mathbb{R}^n$, and let $r$ be a positive number. Show that the open ball $B(\mathbf{x}_0, r)$ and the closed ball $CB(\mathbf{x}_0, r)$ are path-connected sets.


Solution 


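Let $\mathbf{u}$ and $\mathbf{v}$ be two points in $B(\mathbf{x}_0, r)$. Then $\|\mathbf{u} - \mathbf{x}_0\| < r$ and $\|\mathbf{v} - \mathbf{x}_0\| < r$. For any $t \in [0, 1]$, $t \ge 0$ and $1 - t \ge 0$. By the triangle inequality,

$$\begin{aligned}
\|(1 - t)\mathbf{u} + t\mathbf{v} - \mathbf{x}_0\| &\le \|(1 - t)(\mathbf{u} - \mathbf{x}_0)\| + \|t(\mathbf{v} - \mathbf{x}_0)\| \\
&= (1 - t)\|\mathbf{u} - \mathbf{x}_0\| + t\|\mathbf{v} - \mathbf{x}_0\| \\
&< (1 - t)r + tr = r.
\end{aligned}$$

This shows that $(1 - t)\mathbf{u} + t\mathbf{v}$ is in $B(\mathbf{x}_0, r)$. Hence, $B(\mathbf{x}_0, r)$ is convex.

Replacing $<$ by $\le$, one can show that $CB(\mathbf{x}_0, r)$ is convex.
By Theorem 3.2, the open ball $B(\mathbf{x}_0, r)$ and the closed ball $CB(\mathbf{x}_0, r)$ are path-connected sets.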


Not all the path-connected sets are convex. Before we give an example, let us 
first prove the following useful lemma. 


Lemma 3.3 


Let $A$ and $B$ be path-connected subsets of $\mathbb{R}^n$. If $A \cap B$ is nonempty, then $S = A \cup B$ is path-connected.


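Let $\mathbf{u}$ and $\mathbf{v}$ be two points in $S$. If both $\mathbf{u}$ and $\mathbf{v}$ are in the set $A$, then they can be joined by a path in $A$, which is also in $S$. Similarly, if both $\mathbf{u}$ and $\mathbf{v}$ are in the set $B$, then they can be joined by a path in $B$, which is also in $S$. If $\mathbf{u}$ is in $A$ and $\mathbf{v}$ is in $B$, let $\mathbf{x}_0$ be any point in $A \cap B$. Then $\mathbf{u}$ and $\mathbf{x}_0$ are both in the path-connected set $A$, and $\mathbf{v}$ and $\mathbf{x}_0$ are both in the path-connected set $B$. Therefore, there exist continuous functions $\gamma_1 : [0, 1] \to A$ and $\gamma_2 : [1, 2] \to B$ such that $\gamma_1(0) = \mathbf{u}$, $\gamma_1(1) = \mathbf{x}_0$, $\gamma_2(1) = \mathbf{x}_0$ and $\gamma_2(2) = \mathbf{v}$. Define the function $\gamma : [0, 2] \to A \cup B$ by

$$\gamma(t) = \begin{cases} \gamma_1(t), & \text{if } 0 \le t \le 1, \\ \gamma_2(t), & \text{if } 1 \le t \le 2. \end{cases}$$

Since $[0, 1]$ and $[1, 2]$ are closed subsets of $\mathbb{R}$ and $\gamma_1(1) = \gamma_2(1)$, the function $\gamma : [0, 2] \to S$ is well defined and continuous. Thus, $\gamma$ is a path in $S$ from $\mathbf{u}$ to $\mathbf{v}$. This proves that $S$ is path-connected.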
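The gluing of two paths at a common point can be sketched in code. The helper `concatenate` and the two sample paths below are our own illustration, not part of the text; the first path is assumed to be defined on $[0, 1]$ and the second on $[1, 2]$, with matching values at $t = 1$.

```python
import math


def concatenate(gamma1, gamma2):
    """Glue gamma1 on [0, 1] and gamma2 on [1, 2] into one path on [0, 2].

    Assumes gamma1(1) == gamma2(1), so the value at t = 1 is unambiguous.
    """
    def gamma(t):
        if not 0.0 <= t <= 2.0:
            raise ValueError("t must lie in [0, 2]")
        return gamma1(t) if t <= 1.0 else gamma2(t)
    return gamma


def gamma1(t):
    """Quarter circle from (1, 0) to (0, 1), parametrized on [0, 1]."""
    return (math.cos(math.pi * t / 2), math.sin(math.pi * t / 2))


def gamma2(t):
    """Vertical segment from (0, 1) to (0, 0), parametrized on [1, 2]."""
    return (0.0, 2.0 - t)


gamma = concatenate(gamma1, gamma2)
print(gamma(0.0), gamma(1.0), gamma(2.0))
```

Up to floating-point rounding, the two pieces agree at $t = 1$, which is the condition that makes the glued function continuous.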


Now we can give an example of a path-connected set that is not convex. 


Figure 3.4: If two sets $A$ and $B$ are path-connected and $A \cap B$ is nonempty, then $A \cup B$ is also path-connected.


Example 3.7 
Show that the set
$$S = \{(x, y) \mid 0 \le x \le 1,\; -2 \le y \le 2\} \cup \{(x, y) \mid (x - 2)^2 + y^2 \le 1\}$$

is path-connected, but not convex.


Solution 
The set

$$A = \{(x, y) \mid 0 \le x \le 1,\; -2 \le y \le 2\}$$

is a closed rectangle. Therefore, it is path-connected. The set

$$B = \{(x, y) \mid (x - 2)^2 + y^2 \le 1\}$$

is a closed ball with center at $(2, 0)$ and radius 1. Hence, it is also path-connected. Since the point $\mathbf{x}_0 = (1, 0)$ is in both $A$ and $B$, $S = A \cup B$ is path-connected.

The points $\mathbf{u} = (1, 2)$ and $\mathbf{v} = (2, 1)$ are in $S$. Consider the point

$$\mathbf{w} = \frac{1}{2}\mathbf{u} + \frac{1}{2}\mathbf{v} = \left(\frac{3}{2}, \frac{3}{2}\right).$$

It is not in $S$. This shows that $S$ is not convex.


Figure 3.5: The set $A \cup B$ is path-connected but not convex.


Let us now prove the following important theorem which says that continuous functions preserve path-connectedness.


Theorem 3.4 


Let $D$ be a path-connected subset of $\mathbb{R}^n$. If $F : D \to \mathbb{R}^m$ is a continuous function, then $F(D)$ is path-connected.


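Let $\mathbf{v}_1$ and $\mathbf{v}_2$ be two points in $F(D)$. Then there exist $\mathbf{u}_1$ and $\mathbf{u}_2$ in $D$ such that $F(\mathbf{u}_1) = \mathbf{v}_1$ and $F(\mathbf{u}_2) = \mathbf{v}_2$. Since $D$ is path-connected, there is a continuous function $\gamma : [0, 1] \to D$ such that $\gamma(0) = \mathbf{u}_1$ and $\gamma(1) = \mathbf{u}_2$. The map $\alpha = (F \circ \gamma) : [0, 1] \to F(D)$ is then a continuous map with $\alpha(0) = \mathbf{v}_1$ and $\alpha(1) = \mathbf{v}_2$. This shows that $F(D)$ is path-connected.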


From Theorem 3.4, we obtain the following. 


Theorem 3.5 Intermediate Value Theorem for Path-Connected Sets 


Let $D$ be a path-connected subset of $\mathbb{R}^n$, and let $f : D \to \mathbb{R}$ be a function defined on $D$. If $f$ is continuous, then $f(D)$ is an interval.


By Theorem 3.4, $f(D)$ is a path-connected subset of $\mathbb{R}$. By Theorem 3.1, $f(D)$ is an interval.


We can also use Theorem 3.4 to establish more examples of path-connected 
sets. Let us first look at an example. 


Example 3.8 


Show that the circle
$$S^1 = \{(x, y) \mid x^2 + y^2 = 1\}$$
is path-connected.


Solution 
Define the function $f : [0, 2\pi] \to \mathbb{R}^2$ by

$$f(t) = (\cos t, \sin t).$$

Notice that $S^1 = f([0, 2\pi])$. Since each component of $f$ is a continuous function, $f$ is a continuous function. Since $[0, 2\pi]$ is an interval, it is path-connected. By Theorem 3.4, $S^1 = f([0, 2\pi])$ is path-connected.


A more general theorem is as follows. 


Theorem 3.6 


Let $D$ be a path-connected subset of $\mathbb{R}^n$, and let $F : D \to \mathbb{R}^m$ be a function defined on $D$. If $F : D \to \mathbb{R}^m$ is continuous, then the graph of $F$,

$$G_F = \{(\mathbf{x}, \mathbf{y}) \mid \mathbf{x} \in D,\; \mathbf{y} = F(\mathbf{x})\}$$

is a path-connected subset of $\mathbb{R}^{n+m}$.


By Corollary 2.32, the function $H : D \to \mathbb{R}^{n+m}$, $H(\mathbf{x}) = (\mathbf{x}, F(\mathbf{x}))$, is continuous. Since $H(D) = G_F$, Theorem 3.4 implies that $G_F$ is a path-connected subset of $\mathbb{R}^{n+m}$.


Now let us consider spheres, which are boundaries of balls.


Definition 3.4 The Standard Unit n-Sphere S” 


A standard unit $n$-sphere $S^n$ is the subset of $\mathbb{R}^{n+1}$ consisting of all points $\mathbf{x} = (x_1, \ldots, x_n, x_{n+1})$ in $\mathbb{R}^{n+1}$ satisfying the equation $\|\mathbf{x}\| = 1$, namely,

$$x_1^2 + \cdots + x_n^2 + x_{n+1}^2 = 1.$$

The $n$-sphere $S^n$ is the boundary of the $(n + 1)$ open ball $B^{n+1} = B(\mathbf{0}, 1)$ with center at the origin and radius 1.


Figure 3.6: A sphere.


Example 3.9 


Show that the standard unit $n$-sphere $S^n$ is path-connected.


Solution


Notice that $S^n = S^n_+ \cup S^n_-$, where $S^n_+$ and $S^n_-$ are respectively the upper and lower hemispheres with $x_{n+1} \ge 0$ and $x_{n+1} \le 0$ respectively.

If $\mathbf{x} \in S^n_+$, then

$$x_{n+1} = \sqrt{1 - x_1^2 - \cdots - x_n^2};$$

whereas if $\mathbf{x} \in S^n_-$,

$$x_{n+1} = -\sqrt{1 - x_1^2 - \cdots - x_n^2}.$$

Let
$$CB^n = \{(x_1, \ldots, x_n) \mid x_1^2 + \cdots + x_n^2 \le 1\}$$

be the closed ball in $\mathbb{R}^n$ with center at the origin and radius 1. Define the functions $f_\pm : CB^n \to \mathbb{R}$ by

$$f_\pm(x_1, \ldots, x_n) = \pm\sqrt{1 - x_1^2 - \cdots - x_n^2}.$$

Notice that $S^n_+$ and $S^n_-$ are respectively the graphs of $f_+$ and $f_-$. Since they are compositions of the square root function and a polynomial function, which are both continuous, $f_+$ and $f_-$ are continuous functions. The closed ball $CB^n$ is path-connected. Theorem 3.6 then implies that $S^n_+$ and $S^n_-$ are path-connected.

Since both $S^n_+$ and $S^n_-$ contain the unit vector $\mathbf{e}_1$ in $\mathbb{R}^{n+1}$, the set $S^n_+ \cap S^n_-$ is nonempty. By Lemma 3.3, $S^n = S^n_+ \cup S^n_-$ is path-connected.


Remark 3.1 


There is an alternative way to prove that the $n$-sphere $S^n$ is path-connected.

Given two distinct points $\mathbf{u}$ and $\mathbf{v}$ in $S^n$, they are unit vectors in $\mathbb{R}^{n+1}$. We want to show that there is a path in $S^n$ joining $\mathbf{u}$ to $\mathbf{v}$.

Notice that the line segment $L = \{(1 - t)\mathbf{u} + t\mathbf{v} \mid 0 \le t \le 1\}$ in $\mathbb{R}^{n+1}$ contains the origin if and only if $\mathbf{u}$ and $\mathbf{v}$ are parallel, if and only if $\mathbf{v} = -\mathbf{u}$.

Thus, we discuss two cases.

Case 1: $\mathbf{v} \ne -\mathbf{u}$.
In this case, let $\gamma : [0, 1] \to \mathbb{R}^{n+1}$ be the function defined as

$$\gamma(t) = \frac{(1 - t)\mathbf{u} + t\mathbf{v}}{\|(1 - t)\mathbf{u} + t\mathbf{v}\|}.$$

Since $(1 - t)\mathbf{u} + t\mathbf{v} \ne \mathbf{0}$ for all $0 \le t \le 1$, $\gamma$ is a continuous function. It is easy to check that its image lies in $S^n$. Hence, $\gamma$ is a path in $S^n$ joining $\mathbf{u}$ to $\mathbf{v}$.

Case 2: $\mathbf{v} = -\mathbf{u}$.

In this case, let $\mathbf{w}$ be a unit vector orthogonal to $\mathbf{u}$, and let $\gamma : [0, \pi] \to \mathbb{R}^{n+1}$ be the function defined as

$$\gamma(t) = (\cos t)\mathbf{u} + (\sin t)\mathbf{w}.$$

Since $\sin t$ and $\cos t$ are continuous functions, $\gamma$ is a continuous function. Since $\mathbf{u}$ and $\mathbf{w}$ are orthogonal, the generalized Pythagoras theorem implies that

$$\|\gamma(t)\|^2 = \cos^2 t\,\|\mathbf{u}\|^2 + \sin^2 t\,\|\mathbf{w}\|^2 = \cos^2 t + \sin^2 t = 1.$$

Therefore, the image of $\gamma$ lies in $S^n$. It is easy to see that $\gamma(0) = \mathbf{u}$ and $\gamma(\pi) = -\mathbf{u} = \mathbf{v}$. Hence, $\gamma$ is a path in $S^n$ joining $\mathbf{u}$ to $\mathbf{v}$.
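The two cases in this remark translate directly into a small routine. The sketch below is an illustration only: the function names `normalize` and `sphere_path`, the tolerance used to detect $\mathbf{v} = -\mathbf{u}$, and the particular way of choosing the orthogonal unit vector $\mathbf{w}$ are our own assumptions.

```python
import math


def normalize(w):
    """Scale a nonzero vector to unit length."""
    n = math.sqrt(sum(c * c for c in w))
    return tuple(c / n for c in w)


def sphere_path(u, v):
    """Return (gamma, T): a path gamma on [0, T] in the unit sphere from u to v.

    u and v are assumed to be unit vectors of the same dimension (at least 2).
    """
    if all(abs(a + b) < 1e-12 for a, b in zip(u, v)):
        # Case 2 (v = -u): build a unit vector w orthogonal to u by removing
        # the u-component from the standard basis vector least aligned with u.
        i = min(range(len(u)), key=lambda j: abs(u[j]))
        w = normalize(tuple((1.0 if j == i else 0.0) - u[i] * u[j]
                            for j in range(len(u))))

        def gamma(t):
            return tuple(math.cos(t) * a + math.sin(t) * b
                         for a, b in zip(u, w))
        return gamma, math.pi

    # Case 1 (v != -u): the segment from u to v avoids the origin,
    # so it can be normalized pointwise.
    def gamma(t):
        return normalize(tuple((1 - t) * a + t * b for a, b in zip(u, v)))
    return gamma, 1.0


gamma, T = sphere_path((1.0, 0.0, 0.0), (-1.0, 0.0, 0.0))
print(gamma(0.0), gamma(T))
```

In both cases every point of the returned path has norm 1 up to rounding, so the path stays on the sphere.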


Example 3.10 


Let $f : S^n \to \mathbb{R}$ be a continuous function. Show that there is a point $\mathbf{u}_0$ on $S^n$ such that $f(\mathbf{u}_0) = f(-\mathbf{u}_0)$.


Solution 


The function $g : \mathbb{R}^{n+1} \to \mathbb{R}^{n+1}$, $g(\mathbf{u}) = -\mathbf{u}$ is a linear transformation. Hence, it is continuous. Restricted to $S^n$, $g(S^n) = S^n$. Thus, the function $f_1 : S^n \to \mathbb{R}$, $f_1(\mathbf{u}) = f(-\mathbf{u})$, is also continuous.

It follows that the function $h : S^n \to \mathbb{R}$ defined by

$$h(\mathbf{u}) = f(\mathbf{u}) - f(-\mathbf{u})$$

is continuous. Notice that

$$h(-\mathbf{u}) = f(-\mathbf{u}) - f(\mathbf{u}) = -h(\mathbf{u}).$$

This implies that if the number $a$ is in the range of $h$, so is the number $-a$. Since the number 0 is in between $a$ and $-a$ for any $a$, and $S^n$ is path-connected, the intermediate value theorem implies that the number 0 is also in the range of $h$. This means that there is a $\mathbf{u}_0$ on $S^n$ such that $h(\mathbf{u}_0) = 0$. Equivalently, $f(\mathbf{u}_0) = f(-\mathbf{u}_0)$.
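For the circle $S^1$, the argument can be turned into a bisection search: with $h(\mathbf{u}) = f(\mathbf{u}) - f(-\mathbf{u})$ we have $h(-\mathbf{u}) = -h(\mathbf{u})$, so $h$ changes sign along the half circle from angle $0$ to angle $\pi$. The following sketch assumes a continuous `f` of two variables; the name `antipodal_equal_point`, the tolerance and the sample function are ours, chosen only for illustration.

```python
import math


def antipodal_equal_point(f, tol=1e-12):
    """Find theta in [0, pi] with f(p) approximately equal to f(-p),
    where p = (cos theta, sin theta), by bisection on h."""
    def h(theta):
        x, y = math.cos(theta), math.sin(theta)
        return f(x, y) - f(-x, -y)

    lo, hi = 0.0, math.pi
    h_lo = h(lo)
    if h_lo == 0.0:
        return lo
    # h at angle pi is -h at angle 0, so h changes sign on [lo, hi].
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        h_mid = h(mid)
        if h_mid == 0.0:
            return mid
        if (h_mid > 0) == (h_lo > 0):
            lo = mid      # same sign as the left end: root lies to the right
        else:
            hi = mid      # sign change occurs in [lo, mid]
    return 0.5 * (lo + hi)


def sample(x, y):
    """An arbitrary continuous test function on the circle."""
    return x + 2.0 * y + x * y


theta = antipodal_equal_point(sample)
p = (math.cos(theta), math.sin(theta))
print(theta, sample(*p), sample(-p[0], -p[1]))
```

Bisection is used here only as a concrete stand-in for the intermediate value theorem; it locates one such point, and there may be others.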


Theorem 3.5 says that a continuous function defined on a path-connected set satisfies the intermediate value theorem. We make the following definition.


Definition 3.5 Intermediate Value Property 


Let $S$ be a subset of $\mathbb{R}^n$. We say that $S$ has the intermediate value property provided that whenever $f : S \to \mathbb{R}$ is a continuous function, then $f(S)$ is an interval.


Theorem 3.5 says that if $S$ is a path-connected set, then it has the intermediate value property. It is natural to ask whether it is true that any set $S$ that has the intermediate value property must be path-connected. Unfortunately, it turns out that the answer is yes only when $S$ is a subset of $\mathbb{R}$. If $S$ is a subset of $\mathbb{R}^n$ with $n \ge 2$, this is not true. This leads us to define a new property of sets called connectedness in the next section.




Exercises 3.1 


Question 1 


Is the set A = (—1, 2) U (2, 5] path-connected? Justify your answer. 


Question 2 


Let $a$ and $b$ be positive numbers, and let $A$ be the subset of $\mathbb{R}^2$ given by

$$A = \left\{ (x, y) \,\middle|\, \frac{x^2}{a^2} + \frac{y^2}{b^2} \le 1 \right\}.$$

Show that $A$ is convex, and deduce that it is path-connected.


Question 3


Let $(a, b, c)$ be a nonzero vector, and let $P$ be the plane in $\mathbb{R}^3$ given by
$$P = \{(x, y, z) \mid ax + by + cz = d\},$$

where $d$ is a constant. Show that $P$ is convex, and deduce that it is path-connected.


Question 4 


Let $S$ be the subset of $\mathbb{R}^3$ given by
$$S = \{(x, y, z) \mid x > 0,\; y < 1,\; 2 < z < 7\}.$$

Show that $S$ is path-connected.


Question 5 


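Let $a$, $b$ and $c$ be positive numbers, and let $S$ be the subset of $\mathbb{R}^3$ given by

$$S = \left\{ (x, y, z) \,\middle|\, \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \right\}.$$

Show that $S$ is path-connected.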




Question 6 


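Let $\mathbf{u} = (3, 0)$ and let $A$ be the subset of $\mathbb{R}^2$ given by

$$A = \{(x, y) \mid x^2 + y^2 \le 1\}.$$

Define the function $f : A \to \mathbb{R}$ by $f(\mathbf{x}) = d(\mathbf{x}, \mathbf{u})$.
(a) Find $f(\mathbf{x}_1)$ and $f(\mathbf{x}_2)$, where $\mathbf{x}_1 = (1, 0)$ and $\mathbf{x}_2 = (-1, 0)$.

(b) Use the intermediate value theorem to justify that there is a point $\mathbf{x}_0$ in $A$ such that $d(\mathbf{x}_0, \mathbf{u}) = \pi$.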


Question 7 


Let $A$ and $B$ be subsets of $\mathbb{R}^n$. If $A$ and $B$ are convex, show that $A \cap B$ is also convex.




3.2 Connectedness and Intermediate Value Property


In this section, we study a property of sets which is known as connectedness. Let us first look at the path-connected subsets of $\mathbb{R}$ from a different perspective. We have shown in the previous section that a subset of $\mathbb{R}$ is path-connected if and only if it is an interval. A set of the form

$$A = (-2, 2] \setminus \{0\} = (-2, 0) \cup (0, 2]$$

is not path-connected, since it contains the points $-1$ and $1$, but it does not contain the point 0 that is in between. Intuitively, there is no way to go from the point $-1$ to $1$ continuously without leaving the set $A$.

Let $U = (-\infty, 0)$ and $V = (0, \infty)$. Notice that $U$ and $V$ are open subsets of $\mathbb{R}$ which both intersect the set $A$. Moreover,

$$A = (A \cap U) \cup (A \cap V),$$

or equivalently,
$$A \subset U \cup V.$$

We say that $A$ is separated by the open sets $U$ and $V$.


Definition 3.6 Separation of a Set 


Let $A$ be a subset of $\mathbb{R}^n$. A separation of $A$ is a pair $(U, V)$ of subsets of $\mathbb{R}^n$ which satisfies the following conditions.


(a) $U$ and $V$ are open sets.
(b) $A \cap U \ne \emptyset$ and $A \cap V \ne \emptyset$.
(c) $A \subset U \cup V$, or equivalently, $A$ is the union of $A \cap U$ and $A \cap V$.
(d) $A$ is disjoint from $U \cap V$, or equivalently, $A \cap U$ and $A \cap V$ are disjoint.


If $(U, V)$ is a separation of $A$, we say that $A$ is separated by the open sets $U$ and $V$, or the open sets $U$ and $V$ separate $A$.




Example 3.11 


Let $A = (-2, 0) \cup (0, 2]$, and let $U = (-\infty, 0)$ and $V = (0, \infty)$. Then the open sets $U$ and $V$ separate $A$.
Let $U_1 = (-3, 0)$ and $V_1 = (0, 3)$. The open sets $U_1$ and $V_1$ also separate $A$.


Now we define connectedness. 


Definition 3.7 Connected Sets 


Let $A$ be a subset of $\mathbb{R}^n$. We say that $A$ is connected if there does not exist a pair of open sets $U$ and $V$ that separate $A$.


Example 3.12 


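Determine whether the set

$$A = \{(x, y) \mid y = 0\} \cup \left\{ (x, y) \,\middle|\, y = \frac{2}{1 + x^2} \right\}$$

is connected.


Solution
Let $f : \mathbb{R}^2 \to \mathbb{R}$ be the function defined as

$$f(x, y) = y(x^2 + 1).$$

Since $f$ is a polynomial function, it is continuous. The intervals $V_1 = (-1, 1)$ and $V_2 = (1, 3)$ are disjoint open sets in $\mathbb{R}$. Hence, the sets $U_1 = f^{-1}(V_1)$ and $U_2 = f^{-1}(V_2)$ are disjoint and they are open in $\mathbb{R}^2$. Notice that

$$A \cap U_1 = \{(x, y) \mid y = 0\}, \qquad A \cap U_2 = \left\{ (x, y) \,\middle|\, y = \frac{2}{1 + x^2} \right\}.$$

Thus, $A \cap U_1$ and $A \cap U_2$ are nonempty, $A \cap U_1$ and $A \cap U_2$ are disjoint, and $A$ is a union of $A \cap U_1$ and $A \cap U_2$. This shows that the open sets $U_1$ and $U_2$ separate $A$. Hence, $A$ is not connected.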


Figure 3.7: The set $A$ defined in Example 3.12 is not connected.


Now let us explore the relation between path-connected and connected. We first prove the following.


Theorem 3.7 


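Let $A$ be a subset of $\mathbb{R}^n$, and assume that the open sets $U$ and $V$ separate $A$. Define the function $f : A \to \mathbb{R}$ by

$$f(\mathbf{x}) = \begin{cases} 0, & \text{if } \mathbf{x} \in A \cap U, \\ 1, & \text{if } \mathbf{x} \in A \cap V. \end{cases}$$

Then $f$ is continuous.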


Notice that the function $f$ is well defined since $A \cap U$ and $A \cap V$ are disjoint.

Let $\mathbf{x}_0$ be a point in $A$. We want to prove that $f$ is continuous at $\mathbf{x}_0$. Since $A$ is contained in $U \cup V$, $\mathbf{x}_0$ is in $U$ or in $V$. It suffices to consider the case where $\mathbf{x}_0$ is in $U$. The case where $\mathbf{x}_0$ is in $V$ is similar.

If $\mathbf{x}_0$ is in $U$, since $U$ is open, there is an $r > 0$ such that $B(\mathbf{x}_0, r) \subset U$. If $\{\mathbf{x}_k\}$ is a sequence in $A$ that converges to $\mathbf{x}_0$, there exists a positive integer $K$ such that for all $k \ge K$, $\|\mathbf{x}_k - \mathbf{x}_0\| < r$. Thus, for all $k \ge K$, $\mathbf{x}_k \in B(\mathbf{x}_0, r) \subset U$, and hence, $f(\mathbf{x}_k) = 0$. This proves that the sequence $\{f(\mathbf{x}_k)\}$ converges to 0, which is $f(\mathbf{x}_0)$. Therefore, $f$ is continuous at $\mathbf{x}_0$.


Now we can prove the theorem which says that a path-connected set is connected. 




Theorem 3.8 


Let A be a subset of R”. If A is path-connected, then it is connected. 


We prove the contrapositive, which says that if $A$ is not connected, then it is not path-connected.

If $A$ is not connected, there is a pair of open sets $U$ and $V$ that separate $A$. By Theorem 3.7, the function $f : A \to \mathbb{R}$ defined by

$$f(\mathbf{x}) = \begin{cases} 0, & \text{if } \mathbf{x} \in A \cap U, \\ 1, & \text{if } \mathbf{x} \in A \cap V \end{cases}$$

is continuous. Since $f(A) = \{0, 1\}$ is not an interval, by the contrapositive of the intermediate value theorem for path-connected sets, $A$ is not path-connected.


Theorem 3.8 provides us with a large library of connected sets.


Example 3.13 


The following sets are path-connected. Hence, they are also connected. 


1. A set $S$ in $\mathbb{R}^n$ of the form $S = I_1 \times \cdots \times I_n$, where $I_1, \ldots, I_n$ are intervals in $\mathbb{R}$.


2. Open rectangles and closed rectangles. 


3. Open balls and closed balls. 


4. The n-sphere S”. 


The following theorem says that path-connectedness and connectedness are 


equivalent in R. 




Theorem 3.9 


Let S be a subset of R. Then the following are equivalent. 


(a) S is an interval. 
(b) Sis path-connected. 


(c) S is connected. 


We have proved (a) $\iff$ (b) in the previous section. In particular, (a) implies (b). Theorem 3.8 says that (b) implies (c). Now we only need to prove that (c) implies (a).

Assume that (a) is not true. Namely, $S$ is not an interval. Then there are points $u$ and $v$ in $S$ with $u < v$, such that there is a $w \in (u, v)$ that is not in $S$. Let $U = (-\infty, w)$ and $V = (w, \infty)$. Then $U$ and $V$ are disjoint open subsets of $\mathbb{R}$. Since $w \notin S$, $S \subset U \cup V$. Since $u \in S \cap U$ and $v \in S \cap V$, $S \cap U$ and $S \cap V$ are nonempty. Hence, $U$ and $V$ are open sets that separate $S$. This shows that $S$ is not connected. Thus, we have proved that if (a) is not true, then (c) is not true. This is equivalent to (c) implies (a).


Connectedness is also preserved by continuous functions. 


Theorem 3.10 


Let $D$ be a connected subset of $\mathbb{R}^n$. If $F : D \to \mathbb{R}^m$ is a continuous function, then $F(D)$ is connected.


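We prove the contrapositive. Assume that $F(D)$ is not connected. Then there are open sets $V_1$ and $V_2$ in $\mathbb{R}^m$ that separate $F(D)$. Let

$$D_1 = \{\mathbf{x} \in D \mid F(\mathbf{x}) \in V_1\},$$
$$D_2 = \{\mathbf{x} \in D \mid F(\mathbf{x}) \in V_2\}.$$

Since $F(D) \cap V_1$ and $F(D) \cap V_2$ are nonempty, $D_1$ and $D_2$ are nonempty. Since $F(D) \subset V_1 \cup V_2$, $D = D_1 \cup D_2$. Since $V_1 \cap V_2$ is disjoint from $F(D)$, $D_1$ and $D_2$ are disjoint. However, $D_1$ and $D_2$ are not necessarily open sets. We will define two open sets $U_1$ and $U_2$ in $\mathbb{R}^n$ such that $D_1 = D \cap U_1$ and $D_2 = D \cap U_2$. Then $U_1$ and $U_2$ are open sets that separate $D$.

For each $\mathbf{x}_0$ in $D_1$, $F(\mathbf{x}_0) \in V_1$. Since $V_1$ is open, there exists $\varepsilon_{\mathbf{x}_0} > 0$ such that the ball $B(F(\mathbf{x}_0), \varepsilon_{\mathbf{x}_0})$ is contained in $V_1$. By the continuity of $F$ at $\mathbf{x}_0$, there exists $\delta_{\mathbf{x}_0} > 0$ such that for all $\mathbf{x}$ in $D$, if $\mathbf{x} \in B(\mathbf{x}_0, \delta_{\mathbf{x}_0})$, then $F(\mathbf{x}) \in B(F(\mathbf{x}_0), \varepsilon_{\mathbf{x}_0}) \subset V_1$. In other words,

$$D \cap B(\mathbf{x}_0, \delta_{\mathbf{x}_0}) \subset F^{-1}(V_1) = D_1.$$
Notice that $B(\mathbf{x}_0, \delta_{\mathbf{x}_0})$ is an open set. Define

$$U_1 = \bigcup_{\mathbf{x}_0 \in D_1} B(\mathbf{x}_0, \delta_{\mathbf{x}_0}).$$
Being a union of open sets, $U_1$ is open. Since

$$D \cap U_1 = \bigcup_{\mathbf{x}_0 \in D_1} \left(D \cap B(\mathbf{x}_0, \delta_{\mathbf{x}_0})\right) \subset D_1,$$

$$D_1 = \bigcup_{\mathbf{x}_0 \in D_1} \{\mathbf{x}_0\} \subset \bigcup_{\mathbf{x}_0 \in D_1} \left(D \cap B(\mathbf{x}_0, \delta_{\mathbf{x}_0})\right) = D \cap U_1,$$

we find that $D \cap U_1 = D_1$. Similarly, define

$$U_2 = \bigcup_{\mathbf{x}_0 \in D_2} B(\mathbf{x}_0, \delta_{\mathbf{x}_0}).$$

Then $U_2$ is an open set and $D \cap U_2 = D_2$. This completes the construction of the open sets $U_1$ and $U_2$ that separate $D$. Thus, $D$ is not connected.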


From Theorem 3.9 and Theorem 3.10, we also have an intermediate value theorem for connected sets.




Theorem 3.11 Intermediate Value Theorem for Connected Sets 


Let $D$ be a connected subset of $\mathbb{R}^n$, and let $f : D \to \mathbb{R}$ be a function defined on $D$. If $f$ is continuous, then $f(D)$ is an interval.


By Theorem 3.10, $f(D)$ is a connected subset of $\mathbb{R}$. By Theorem 3.9, $f(D)$ is an interval.


Now we can prove the following. 


Theorem 3.12 


Let $S$ be a subset of $\mathbb{R}^n$. Then $S$ is connected if and only if it has the intermediate value property.


If S is connected and f : S — R is continuous, Theorem 3.11 implies that 
f(S) is an interval. Hence, S has the intermediate value property. 


If S is not connected, Theorem 3.7 gives a continuous function f : S > R 
such that f(S) = {0,1} is not an interval. Thus, S does not have the 
intermediate value property. 


To give an example of a connected set that is not path-connected, we need a 
lemma. 


Lemma 3.13 


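Let $A$ be a subset of $\mathbb{R}^n$ that is separated by the open sets $U$ and $V$. If $C$ is a connected subset of $A$, then $C \cap U = \emptyset$ or $C \cap V = \emptyset$.


Since $C \subset A$, $C \subset U \cup V$, and $C$ is disjoint from $U \cap V$. If $C \cap U \ne \emptyset$ and $C \cap V \ne \emptyset$, then the open sets $U$ and $V$ also separate $C$. This contradicts the assumption that $C$ is connected. Thus, we must have $C \cap U = \emptyset$ or $C \cap V = \emptyset$.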




Theorem 3.14 


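Let $A$ be a connected subset of $\mathbb{R}^n$. If $B$ is a subset of $\mathbb{R}^n$ such that
$$A \subset B \subset \overline{A},$$

then $B$ is also connected.


If $B$ is not connected, there exist open sets $U$ and $V$ in $\mathbb{R}^n$ that separate $B$. Since $A$ is a connected subset of $B$, Lemma 3.13 says that $A \cap U = \emptyset$ or $A \cap V = \emptyset$. Without loss of generality, assume that $A \cap V = \emptyset$. Then $A \subset \mathbb{R}^n \setminus V$. Thus, $\mathbb{R}^n \setminus V$ is a closed set that contains $A$. This implies that $\overline{A} \subset \mathbb{R}^n \setminus V$. Hence, we also have $B \subset \mathbb{R}^n \setminus V$, which contradicts the fact that the set $B \cap V$ is not empty.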


Example 3.14 The Topologist’s Sine Curve 


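Let $S$ be the subset of $\mathbb{R}^2$ given by $S = A \cup L$, where

$$A = \left\{ (x, y) \,\middle|\, 0 < x \le 1,\; y = \sin\left(\frac{1}{x}\right) \right\},$$

$$L = \{(x, y) \mid x = 0,\; -1 \le y \le 1\}.$$

(a) Show that $S \subset \overline{A}$.
(b) Show that $S$ is connected.
(c) Show that $S$ is not path-connected.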


Solution 


(a) Since $A \subset \overline{A}$, it suffices to show that $L \subset \overline{A}$. Given $(0, u) \in L$, $-1 \le u \le 1$. Thus, $a = \sin^{-1} u \in [-\pi/2, \pi/2]$. Let

$$x_k = \frac{1}{a + 2\pi k} \quad \text{for } k \in \mathbb{Z}^+.$$

Notice that $x_k \in (0, 1]$ and
$$\sin\frac{1}{x_k} = \sin a = u.$$

Thus, $\{(x_k, \sin(1/x_k))\}$ is a sequence of points in $A$ that converges to $(0, u)$. This proves that $(0, u) \in \overline{A}$. Hence, $L \subset \overline{A}$.

(b) The interval $(0, 1]$ is path-connected and the function $f : (0, 1] \to \mathbb{R}$, $f(x) = \sin\left(\dfrac{1}{x}\right)$ is continuous. Thus, $A = G_f$ is path-connected, and hence it is connected. Since $A \subset S \subset \overline{A}$, Theorem 3.14 implies that $S$ is connected.

(c) If $S$ is path-connected, there is a path $\gamma : [0, 1] \to S$ such that $\gamma(0) = (0, 0)$ and $\gamma(1) = (1, \sin 1)$. Let $\gamma(t) = (\gamma_1(t), \gamma_2(t))$. Then $\gamma_1 : [0, 1] \to \mathbb{R}$ and $\gamma_2 : [0, 1] \to \mathbb{R}$ are continuous functions. Consider the sequence $\{x_k\}$ with

$$x_k = \frac{1}{\dfrac{\pi}{2} + \pi k}, \quad k \in \mathbb{Z}^+.$$

Notice that $\{x_k\}$ is a decreasing sequence of points in $[0, 1]$ that converges to 0. For each $k \in \mathbb{Z}^+$, $(x_k, y_k) \in S$ if and only if
$$y_k = \sin\frac{1}{x_k} = (-1)^k.$$

Since $\gamma_1 : [0, 1] \to \mathbb{R}$ is continuous, $\gamma_1(0) = 0$ and $\gamma_1(1) = 1$, the intermediate value theorem implies that there exists $t_1 \in [0, 1]$ such that $\gamma_1(t_1) = x_1$. Similarly, there exists $t_2 \in [0, t_1]$ such that $\gamma_1(t_2) = x_2$. Continuing the argument gives a decreasing sequence $\{t_k\}$ in $[0, 1]$ such that $\gamma_1(t_k) = x_k$ for all $k \in \mathbb{Z}^+$. Since the sequence $\{t_k\}$ is bounded below, it converges to some $t_0$ in $[0, 1]$. Since $\gamma_2 : [0, 1] \to \mathbb{R}$ is also continuous, the sequence $\{\gamma_2(t_k)\}$ should converge to $\gamma_2(t_0)$.

Since $\gamma(t_k) \in S$ and $\gamma_1(t_k) = x_k$, we must have $\gamma_2(t_k) = y_k = (-1)^k$. But then the sequence $\{\gamma_2(t_k)\}$ is not convergent. This gives a contradiction. Hence, there does not exist a path in $S$ that joins the point $(0, 0)$ to the point $(1, \sin 1)$. This proves that $S$ is not path-connected.
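The two sequences used in this solution can be inspected numerically. The sketch below is only an illustration: it assumes the sequences $x_k = 1/(a + 2\pi k)$ with $a = \sin^{-1} u$ for part (a) and $x_k = 1/(\pi/2 + \pi k)$ for part (c), and the function names are ours.

```python
import math


def points_approaching(u, count):
    """Points (x_k, sin(1/x_k)) on the curve with x_k = 1/(asin(u) + 2*pi*k).

    They approach (0, u), illustrating that (0, u) lies in the closure of A.
    """
    a = math.asin(u)
    xs = [1.0 / (a + 2.0 * math.pi * k) for k in range(1, count + 1)]
    return [(x, math.sin(1.0 / x)) for x in xs]


def oscillating_heights(count):
    """Heights sin(1/x_k) for x_k = 1/(pi/2 + pi*k); they alternate in sign."""
    return [math.sin(math.pi / 2.0 + math.pi * k) for k in range(1, count + 1)]


pts = points_approaching(0.3, 50)
print(pts[0], pts[-1])          # x shrinks towards 0 while y stays near 0.3
print(oscillating_heights(6))   # values near -1, 1, -1, 1, -1, 1
```

The first sequence explains why every point of $L$ is a limit of points of $A$; the second shows the oscillation between heights $-1$ and $1$ that rules out a continuous path reaching $x = 0$.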




Figure 3.8: The topologist’s sine curve. 


Remark 3.2 


Example 3.14 gives a set that is connected but not path-connected.


1. One can in fact show that $S = \overline{A}$.


2. To show that $A$ is connected, we can also use the fact that if $D$ is a connected subset of $\mathbb{R}^n$, and $F : D \to \mathbb{R}^m$ is a continuous function, then the graph of $F$ is connected. The proof of this fact is left as an exercise.


At the end of this section, we want to give a sufficient condition for a connected subset of $\mathbb{R}^n$ to be path-connected.


First we define the meaning of a polygonal path. 


Definition 3.8 Polygonal Path 


Let S be a subset of R^n, and let u and v be two points in S. A path γ : [a,b] → S in S that joins u to v is a polygonal path provided that there is a partition P = {t_0, t_1, ..., t_k} of [a,b] such that for 1 ≤ i ≤ k,

$$\gamma(t) = x_{i-1} + \frac{t - t_{i-1}}{t_i - t_{i-1}}\,(x_i - x_{i-1}), \qquad \text{when } t_{i-1} \le t \le t_i.$$

Here x_i = γ(t_i) for 0 ≤ i ≤ k, so that x_0 = u and x_k = v.
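The formula in Definition 3.8 can be evaluated directly. The following is a minimal Python sketch, not part of the text: the function name `polygonal_path` and the list-based input format are assumptions made for illustration only.

```python
from bisect import bisect_right

def polygonal_path(points, knots):
    """Return gamma for the polygonal path through `points` at times `knots`.

    points: list of tuples x_0, ..., x_k in R^n (hypothetical input format)
    knots:  strictly increasing list t_0 < t_1 < ... < t_k
    """
    if len(points) != len(knots) or len(points) < 2:
        raise ValueError("need one knot per point and at least two points")

    def gamma(t):
        if not knots[0] <= t <= knots[-1]:
            raise ValueError("t outside [a, b]")
        # Find i with t_{i-1} <= t <= t_i (the last segment is used when t = b).
        i = min(bisect_right(knots, t), len(knots) - 1)
        s = (t - knots[i - 1]) / (knots[i] - knots[i - 1])
        return tuple(p + s * (q - p) for p, q in zip(points[i - 1], points[i]))

    return gamma

# A path in R^2 from u = (0, 0) to v = (1, 2) with one corner at (1, 0).
gamma = polygonal_path([(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)], [0.0, 1.0, 2.0])
print(gamma(0.5), gamma(1.0), gamma(1.5))
```

At a partition point t_i the two adjacent formulas agree, which is why γ is continuous.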


Obviously, we have the following. 


Proposition 3.15 


If S is a convex subset of R^n, then any two points in S can be joined by a polygonal path in S.




Figure 3.9: A polygonal path. 


If γ_1 : [a,c] → A is a polygonal path in A that joins u to w, and γ_2 : [c,b] → B is a polygonal path in B that joins w to v, then the path γ : [a,b] → A ∪ B,

$$\gamma(t) = \begin{cases} \gamma_1(t), & \text{if } a \le t \le c, \\ \gamma_2(t), & \text{if } c \le t \le b, \end{cases}$$

is a polygonal path in A ∪ B that joins u to v. Using this, we can prove the following useful theorem.


Theorem 3.16 


Let S be a connected subset of R^n. If S is an open set, then any two points in S can be joined by a polygonal path in S. In particular, S is path-connected.


We use proof by contradiction. Suppose that S is open but there are two points u and v in S that cannot be joined by a polygonal path in S. Consider the sets

U = {x ∈ S | there is a polygonal path in S that joins u to x},

V = {x ∈ S | there is no polygonal path in S that joins u to x}.

Obviously u is in U and v is in V, the sets U and V are disjoint, and S = U ∪ V. We claim that both U and V are open sets.




If x is in the open set S, there is an r > 0 such that B(x,r) ⊂ S. Since B(x,r) is convex, any point w in B(x,r) can be joined to x by a polygonal path in B(x,r). Hence, if x is in U, then w is in U. If x is in V, then w is in V, since otherwise a polygonal path from u to w followed by one from w to x would join u to x. This shows that if x is in U, then B(x,r) ⊂ U. If x is in V, then B(x,r) ⊂ V. Hence, U and V are open sets.

Since U and V are nonempty disjoint open sets and S = U ∪ V, they form a separation of S. This contradicts the assumption that S is connected. Hence, any two points in S can be joined by a polygonal path in S.




Exercises 3.2 


Question 1 


Determine whether the set 


A= {(en)|y=9}U { (ow) r>oy= = 


is connected. 


Question 2 


Let D be a connected subset of R^n, and let F : D → R^m be a function defined on D. If F : D → R^m is continuous, show that the graph of F,

G_F = {(x, y) | x ∈ D, y = F(x)}

is also connected.


Question 3 


Determine whether the set 
a) ee a eo ey 
is connected. 


Question 4 


Assume that A is a connected subset of R^3 that contains the points u = (0, 2, 0) and v = (2, −6, 3).

(a) Show that there is a point x = (x, y, z) in A that lies in the plane y = 0.

(b) Show that there exists a point x = (x, y, z) in A that lies on the sphere x^2 + y^2 + z^2 = 25.




Question 5 


Let A and B be connected subsets of R^n. If A ∩ B is nonempty, show that S = A ∪ B is connected.




3.3 Sequential Compactness and Compactness


In volume I, we have seen that sequential compactness plays an important role in the extreme value theorem. In this section, we extend the definition of sequential compactness to subsets of R^n. We will also consider another concept called compactness.


Let us start with the definition of bounded sets. 


Definition 3.9 Bounded Sets 


Let S be a subset of R^n. We say that S is bounded if there exists a positive number M such that

$$\|x\| \le M \quad \text{for all } x \in S.$$


Remark 3.3 


Let S be a subset of R^n. If S is bounded and S′ is a subset of S, then it is obvious that S′ is also bounded.


Example 3.15 
Show that a ball B(x_0, r) in R^n is bounded.

Solution
Given x ∈ B(x_0, r), ‖x − x_0‖ < r. Thus,

$$\|x\| \le \|x_0\| + \|x - x_0\| < \|x_0\| + r.$$

Since M = ‖x_0‖ + r is a constant independent of the points in the ball B(x_0, r), the ball B(x_0, r) is bounded.


Notice that if x_1 and x_2 are points in R^n, and S is a set in R^n such that

$$\|x - x_1\| < r_1 \quad \text{for all } x \in S,$$

then

$$\|x - x_2\| < r_1 + \|x_2 - x_1\| \quad \text{for all } x \in S.$$


Thus, we have the following. 




Proposition 3.17 


Let S be a subset of R^n. The following are equivalent.

(a) S is bounded.

(b) There is a point x_0 in R^n and a positive constant M such that

$$\|x - x_0\| \le M \quad \text{for all } x \in S.$$

(c) For any x_0 in R^n, there is a positive constant M such that

$$\|x - x_0\| \le M \quad \text{for all } x \in S.$$


Figure 3.10: The set S is bounded. 


We say that a sequence {x_k} is bounded if the set {x_k | k ∈ Z^+} is bounded.
The following is a standard theorem about convergent sequences.

Proposition 3.18

If {x_k} is a sequence in R^n that is convergent, then it is bounded.




Assume that the sequence {x_k} converges to the point x_0. Then there is a positive integer K such that

$$\|x_k - x_0\| < 1 \quad \text{for all } k \ge K.$$

Let

$$M = \max\{\|x_k - x_0\| \mid 1 \le k \le K - 1\} + 1.$$

Then M is finite and

$$\|x_k - x_0\| \le M \quad \text{for all } k \in \mathbb{Z}^+.$$

Hence, by Proposition 3.17, the sequence {x_k} is bounded.
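To make the choice of K and M concrete, here is a small Python sketch for one particular sequence. The sequence x_k = (1/k, 1 + (−1)^k/k) is a hypothetical example chosen for illustration; it is not taken from the text.

```python
import math

def x(k):
    # Hypothetical example sequence in R^2 converging to x0 = (0, 1).
    return (1.0 / k, 1.0 + (-1) ** k / k)

x0 = (0.0, 1.0)

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Here ||x_k - x0|| = sqrt(2)/k, so ||x_k - x0|| < 1 for all k >= K with K = 2.
K = 2
M = max(dist(x(k), x0) for k in range(1, K)) + 1

# Spot-check the bound ||x_k - x0|| <= M on the first 10000 terms.
assert all(dist(x(k), x0) <= M for k in range(1, 10001))
print(K, M)
```

The finitely many terms before index K are handled by the maximum, and all later terms by the extra 1.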


Figure 3.11: A convergent sequence is bounded. 


Let us now define the diameter of a bounded set. If S is a subset of R^n that is bounded, there is a positive number M such that

$$\|x\| \le M \quad \text{for all } x \in S.$$

It follows from the triangle inequality that for any u and v in S,

$$\|u - v\| \le \|u\| + \|v\| \le 2M.$$

Thus, the set

$$D_S = \{d(u, v) \mid u, v \in S\} = \{\|u - v\| \mid u, v \in S\} \qquad (3.1)$$

is a set of nonnegative real numbers that is bounded above. In fact, for any subset S of R^n, one can define the set of real numbers D_S by (3.1). Then S is a bounded set if and only if the set D_S is bounded above.


Definition 3.10 Diameter of a Bounded Set 


Let S be a bounded subset of R^n. The diameter of S, denoted by diam S, is defined as

$$\operatorname{diam} S = \sup\{d(u, v) \mid u, v \in S\} = \sup\{\|u - v\| \mid u, v \in S\}.$$


Example 3.16 


Consider the rectangle R = [a_1, b_1] × ··· × [a_n, b_n]. If u and v are two points in R, for each 1 ≤ i ≤ n, u_i, v_i ∈ [a_i, b_i]. Thus,

$$|u_i - v_i| \le b_i - a_i.$$

It follows that

$$\|u - v\| \le \sqrt{(b_1 - a_1)^2 + \cdots + (b_n - a_n)^2}.$$

If u_0 = a = (a_1, ..., a_n) and v_0 = b = (b_1, ..., b_n), then u_0 and v_0 are in R, and

$$\|u_0 - v_0\| = \sqrt{(b_1 - a_1)^2 + \cdots + (b_n - a_n)^2}.$$

This shows that the diameter of R is

$$\operatorname{diam} R = \|b - a\| = \sqrt{(b_1 - a_1)^2 + \cdots + (b_n - a_n)^2}.$$
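The formula for diam R can be spot-checked numerically. The sketch below is only an illustration: the rectangle and the helper names `rectangle_diameter` and `random_point` are assumptions, and random sampling is a sanity check, not a proof.

```python
import math
import random

def rectangle_diameter(a, b):
    """Diameter of [a_1, b_1] x ... x [a_n, b_n], i.e. ||b - a||."""
    return math.sqrt(sum((bi - ai) ** 2 for ai, bi in zip(a, b)))

def random_point(a, b, rng):
    return [rng.uniform(ai, bi) for ai, bi in zip(a, b)]

a, b = [0.0, -1.0, 2.0], [3.0, 3.0, 2.0]   # hypothetical rectangle in R^3
d = rectangle_diameter(a, b)                # sqrt(9 + 16 + 0) = 5.0

rng = random.Random(0)
largest = 0.0
for _ in range(1000):
    u, v = random_point(a, b, rng), random_point(a, b, rng)
    largest = max(largest, math.dist(u, v))

# Sampled distances never exceed the diameter (up to rounding),
# and the opposite corners a and b attain it.
assert largest <= d + 1e-12
assert abs(math.dist(a, b) - d) < 1e-12
print(d, largest)
```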


Figure 3.12: The diameter of a rectangle. 




Intuitively, the diameter of the open rectangle U = (a_1, b_1) × ··· × (a_n, b_n) is also equal to

$$d = \sqrt{(b_1 - a_1)^2 + \cdots + (b_n - a_n)^2}.$$

However, the points a = (a_1, ..., a_n) and b = (b_1, ..., b_n) are not in U. There do not exist two points in U whose distance is d, but there are sequences of points {u_k} and {v_k} in U such that their distances {‖u_k − v_k‖} approach d as k → ∞. We will formulate this as a more general theorem.


Theorem 3.19 


Let S be a subset of R^n. If S is bounded, then its closure S̄ is also bounded. Moreover, diam S̄ = diam S.

If u and v are two points in S̄, there exist sequences {u_k} and {v_k} in S that converge respectively to u and v. Then

$$d(u, v) = \lim_{k \to \infty} d(u_k, v_k). \qquad (3.2)$$

For each k ∈ Z^+, since u_k and v_k are in S,

$$d(u_k, v_k) \le \operatorname{diam} S.$$

Eq. (3.2) implies that

$$d(u, v) \le \operatorname{diam} S.$$

Since this is true for any u and v in S̄, S̄ is bounded and

$$\operatorname{diam} \overline{S} \le \operatorname{diam} S.$$

Since S ⊂ S̄, we also have diam S ≤ diam S̄. We conclude that diam S̄ = diam S.


The following example justifies that the diameter of a ball of radius r is indeed 2r.


Example 3.17 


Find the diameter of the open ball B(x_0, r) in R^n.

Solution

By Theorem 3.19, the diameter of the open ball B(x_0, r) is the same as the diameter of its closure, the closed ball CB(x_0, r). Given u and v in CB(x_0, r), ‖u − x_0‖ ≤ r and ‖v − x_0‖ ≤ r. Therefore,

$$\|u - v\| \le \|u - x_0\| + \|v - x_0\| \le 2r.$$

This shows that diam CB(x_0, r) ≤ 2r. The points u_0 = x_0 + r e_1 and v_0 = x_0 − r e_1 are in the closed ball CB(x_0, r). Since

$$\|u_0 - v_0\| = \|2r e_1\| = 2r,$$

diam CB(x_0, r) ≥ 2r. Therefore, the diameter of the closed ball CB(x_0, r) is exactly 2r. By Theorem 3.19, the diameter of the open ball B(x_0, r) is also 2r.


Figure 3.13: The diameter of a ball. 


In volume I, we have shown that a bounded sequence in R has a convergent subsequence. This is achieved by using the monotone convergence theorem, which says that a bounded monotone sequence in R is convergent. For points in R^n with n ≥ 2, we cannot apply the monotone convergence theorem, as we cannot define a simple order on the points in R^n when n ≥ 2. Nevertheless, we can use the result for n = 1 and the componentwise convergence theorem to show that a bounded sequence in R^n has a convergent subsequence.




Theorem 3.20 


Let {u_k} be a sequence in R^n. If {u_k} is bounded, then there is a subsequence that is convergent.

Sketch of Proof
The n = 1 case is already established in volume I. Here we prove the n = 2 case. The n ≥ 3 case can be proved by induction using the same reasoning.
For k ∈ Z^+, let u_k = (x_k, y_k). Since

$$|x_k| \le \|u_k\| \quad \text{and} \quad |y_k| \le \|u_k\|,$$

the sequences {x_k} and {y_k} are bounded sequences. Thus, there is a subsequence {x_{k_j}}_{j=1}^∞ of {x_k}_{k=1}^∞ that converges to a point x_0 in R. Consider the subsequence {y_{k_j}}_{j=1}^∞ of the sequence {y_k}_{k=1}^∞. It is also bounded. Hence, there is a subsequence {y_{k_{j_l}}}_{l=1}^∞ that converges to a point y_0 in R. Notice that the subsequence {x_{k_{j_l}}}_{l=1}^∞ of {x_k}_{k=1}^∞ is also a subsequence of {x_{k_j}}_{j=1}^∞. Hence, it also converges to x_0. By the componentwise convergence theorem, {u_{k_{j_l}}}_{l=1}^∞ is a subsequence of {u_k}_{k=1}^∞ that converges to (x_0, y_0). This proves the theorem when n = 2.
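The two-stage index selection in the proof can be illustrated on a finite portion of a concrete sequence. This is only a sketch: the sequence u_k = ((−1)^k, (−1)^⌊k/2⌋) is a hypothetical example, and the convergent subsequences are picked by inspection, whereas the theorem itself only asserts that they exist.

```python
def u(k):
    # Hypothetical bounded, non-convergent sequence in R^2.
    return ((-1) ** k, (-1) ** (k // 2))

N = 40
# Step 1: indices k_j along which the first component converges (here: is constant).
k_j = [k for k in range(1, N + 1) if u(k)[0] == 1]
# Step 2: refine to indices k_{j_l} along which the second component also converges.
k_jl = [k for k in k_j if u(k)[1] == 1]

print(k_j[:5])    # [2, 4, 6, 8, 10]
print(k_jl[:5])   # [4, 8, 12, 16, 20]
assert all(u(k) == (1, 1) for k in k_jl)
```

The second list is a sublist of the first, which is why the first component still converges along it.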


Now we study the concept of sequential compactness. It is the same as the n = 1 case.

Definition 3.11 Sequentially Compact

Let S be a subset of R^n. We say that S is sequentially compact provided that every sequence in S has a subsequence that converges to a point in S.

In volume I, we proved the Bolzano-Weierstrass theorem, which says that a subset of R is sequentially compact if and only if it is closed and bounded. In fact, the same is true for the n ≥ 2 case. Let us first look at some examples.


Example 3.18 


Show that the set A = {(x, y) | x^2 + y^2 < 1} is not sequentially compact.

Solution

For k ∈ Z^+, let

$$u_k = \left(\frac{k}{k+1}, 0\right).$$

Then {u_k} is a sequence in A that converges to the point u_0 = (1, 0) that is not in A. Thus, every subsequence of {u_k} converges to the point u_0, which is not in A. This means the sequence {u_k} in A does not have a subsequence that converges to a point in A. Hence, A is not sequentially compact.


Note that the set A in Example 3.18 is not closed.

Example 3.19

Show that the set C = {(x, y) | 1 ≤ x ≤ 3, y ≥ 0} is not sequentially compact.

Solution
For k ∈ Z^+, let u_k = (2, k). Then {u_k} is a sequence in C. If {u_{k_j}}_{j=1}^∞ is a subsequence of {u_k}, then k_1, k_2, k_3, ... is a strictly increasing sequence of positive integers. Therefore k_j ≥ j for all j ∈ Z^+. It follows that

$$\|u_{k_j}\| = \|(2, k_j)\| \ge k_j \ge j \quad \text{for all } j \in \mathbb{Z}^+.$$

Hence, the subsequence {u_{k_j}} is not bounded. Therefore, it is not convergent. This means that the sequence {u_k} in C does not have a convergent subsequence. Therefore, C is not sequentially compact.


Note that the set C in Example 3.19 is not bounded.
Now we prove the main theorem.


Theorem 3.21 Bolzano-Weierstrass Theorem 


Let S be a subset of R^n. The following are equivalent.
(a) S is closed and bounded.

(b) S is sequentially compact.

First assume that S is closed and bounded. Let {x_k} be a sequence in S. Then {x_k} is also bounded. By Theorem 3.20, there is a subsequence {x_{k_j}} that converges to some x_0. Since S is closed, x_0 must be in S. This proves that every sequence in S has a subsequence that converges to a point in S. Hence, S is sequentially compact. This completes the proof that (a) implies (b).

To prove that (b) implies (a), it suffices to show that if S is not closed or S is not bounded, then S is not sequentially compact.

If S is not closed, there is a sequence {x_k} in S that converges to a point x_0, but x_0 is not in S. Then every subsequence of {x_k} converges to the point x_0, which is not in S. Thus, {x_k} is a sequence in S that does not have any subsequence that converges to a point in S. This shows that S is not sequentially compact.

If S is not bounded, for each positive integer k, there is a point x_k in S such that ‖x_k‖ ≥ k. If {x_{k_j}}_{j=1}^∞ is a subsequence of {x_k}, then k_1, k_2, k_3, ... is a strictly increasing sequence of positive integers. Therefore k_j ≥ j for all j ∈ Z^+. It follows that ‖x_{k_j}‖ ≥ k_j ≥ j for all j ∈ Z^+. Hence, the subsequence {x_{k_j}} is not bounded. Therefore, it is not convergent. This means that the sequence {x_k} in S does not have a convergent subsequence. Therefore, S is not sequentially compact.


Corollary 3.22 


A closed rectangle R = [a_1, b_1] × ··· × [a_n, b_n] in R^n is sequentially compact.

We have shown in Chapter 1 that R is closed. Example 3.16 shows that R is bounded. Thus, R is sequentially compact.


An interesting consequence of Theorem 3.19 is the following. 


Corollary 3.23 


If S is a bounded subset of R^n, then its closure S̄ is sequentially compact.


Example 3.20 


Determine whether each of the following subsets of R^3 is sequentially compact.
(a) A = {(x, y, z) | xyz = 1}.
(b) B = {(x, y, z) | x^2 + 4y^2 + 9z^2 ≤ 36}.


(One — | (rhe) | Ve 22 Sy 23) ye 


Solution

(a) For any k ∈ Z^+, let

$$u_k = \left(k, \frac{1}{k}, 1\right).$$

Then {u_k} is a sequence in A, and ‖u_k‖ ≥ k. Therefore, A is not bounded. Hence, A is not sequentially compact.

(b) For any u = (x, y, z) ∈ B,

$$\|u\|^2 = x^2 + y^2 + z^2 \le x^2 + 4y^2 + 9z^2 \le 36.$$

Hence, B is bounded. The function f : R^3 → R, f(x, y, z) = x^2 + 4y^2 + 9z^2 is a polynomial. Hence, it is continuous. Since the set I = (−∞, 36] is closed in R, and B = f^{-1}(I), B is closed in R^3. Since B is closed and bounded, it is sequentially compact.

(c) For any k ∈ Z^+, let

$$u_k = \left(1, 1, \frac{1}{k}\right).$$

Then {u_k} is a sequence of points in C that converges to the point u_0 = (1, 1, 0), which is not in C. Thus, C is not closed, and so C is not sequentially compact.


The following theorem asserts that continuous functions preserve sequential compactness.


Theorem 3.24 


Let D be a sequentially compact subset of R^n. If the function F : D → R^m is continuous, then F(D) is a sequentially compact subset of R^m.

The proof of this theorem is identical to the n = 1 case.

Let {y_k} be a sequence in F(D). For each k ∈ Z^+, there exists x_k ∈ D such that F(x_k) = y_k. Since D is sequentially compact, there is a subsequence {x_{k_j}} of {x_k} that converges to a point x_0 in D. Since F is continuous, the sequence {F(x_{k_j})} converges to F(x_0). Note that F(x_0) is in F(D). In other words, {y_{k_j}} is a subsequence of the sequence {y_k} that converges to F(x_0) in F(D). This shows that every sequence in F(D) has a subsequence that converges to a point in F(D). Thus, F(D) is a sequentially compact subset of R^m.


We are going to discuss important consequences of Theorem 3.24 in the coming 
section. For the rest of this section, we introduce the concept of compactness, 
which plays a central role in modern analysis. We start with the definition of an 
open covering. 




Definition 3.12 Open Covering 


Let S be a subset of R^n, and let 𝒜 = {U_α | α ∈ J} be a collection of open sets in R^n indexed by the set J. We say that 𝒜 is an open covering of S provided that

$$S \subset \bigcup_{\alpha \in J} U_\alpha.$$


Example 3.21 


For each k ∈ Z^+, let U_k = (1/k, 1). Then U_k is an open set in R and

$$\bigcup_{k=1}^{\infty} U_k = (0, 1).$$

Hence, 𝒜 = {U_k | k ∈ Z^+} is an open covering of the set S = (0, 1).


Remark 3.4 


If 𝒜 = {U_α | α ∈ J} is an open covering of S and S′ is a subset of S, then 𝒜 = {U_α | α ∈ J} is also an open covering of S′.


Example 3.22 


For each k ∈ Z^+, let U_k = B(0, k) be the ball in R^n centered at the origin and having radius k. Then

$$\bigcup_{k=1}^{\infty} U_k = \mathbb{R}^n.$$

Thus, 𝒜 = {U_k | k ∈ Z^+} is an open covering of any subset S of R^n.


Definition 3.13 Subcover 


Let S be a subset of R^n, and let 𝒜 = {U_α | α ∈ J} be an open covering of S. A subcover is a subcollection of 𝒜 which is also a covering of S. A finite subcover is a subcover that contains only finitely many elements.




Example 3.23 


For each k ∈ Z, let U_k = (k, k + 2). Then

$$\bigcup_{k=-\infty}^{\infty} U_k = \mathbb{R}.$$

Thus, 𝒜 = {U_k | k ∈ Z} is an open covering of the set S = [−3, 4). There is a finite subcover of S given by

$$\mathscr{B} = \{U_{-4}, U_{-3}, U_{-2}, U_{-1}, U_0, U_1, U_2\}.$$
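As a sanity check on Example 3.23, the following Python sketch tests the seven intervals U_{−4}, ..., U_2 against a grid of sample points of S = [−3, 4). The helper `covered` and the grid step are assumptions made for illustration; a finite sample is a spot check, not a proof.

```python
from fractions import Fraction

def covered(x, ks):
    """True if x lies in some open interval U_k = (k, k + 2) with k in ks."""
    return any(k < x < k + 2 for k in ks)

subcover = range(-4, 3)                      # k = -4, -3, ..., 2
# Sample S = [-3, 4) on a grid of step 1/100.
samples = [Fraction(n, 100) for n in range(-300, 400)]
assert all(covered(x, subcover) for x in samples)

# Dropping U_{-4} loses the left endpoint -3, since U_{-3} = (-3, -1) is open.
assert not covered(Fraction(-3), range(-3, 3))
print("all", len(samples), "sample points are covered")
```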


Definition 3.14 Compact Sets 


Let S be a subset of R^n. We say that S is compact provided that every open covering of S has a finite subcover. Namely, if 𝒜 = {U_α | α ∈ J} is an open covering of S, then there exist α_1, ..., α_m ∈ J such that

$$S \subset \bigcup_{j=1}^{m} U_{\alpha_j}.$$


Example 3.24 


The subset S = (0, 1) of R is not compact. For k ∈ Z^+, let U_k = (1/k, 1). Example 3.21 says that 𝒜 = {U_k | k ∈ Z^+} is an open covering of the set S. We claim that there is no finite subcollection of 𝒜 that covers S.
Assume to the contrary that there exists a finite subcollection of 𝒜 that covers S. Then there are positive integers k_1, ..., k_m such that

$$(0, 1) \subset \bigcup_{j=1}^{m} U_{k_j} = \bigcup_{j=1}^{m} \left(\frac{1}{k_j}, 1\right).$$

Notice that if k_i ≤ k_j, then U_{k_i} ⊂ U_{k_j}. Thus, if K = max{k_1, ..., k_m}, then

$$\bigcup_{j=1}^{m} U_{k_j} = U_K = \left(\frac{1}{K}, 1\right),$$

and so S = (0, 1) is not contained in U_K. This gives a contradiction. Hence, S is not compact.
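The argument in Example 3.24 is constructive: for any finite list of indices one can name a point of (0, 1) that is missed. A minimal Python sketch follows; the function name `uncovered_point` is hypothetical, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

def uncovered_point(ks):
    """Given finitely many indices k_1, ..., k_m, return a point of (0, 1)
    that lies in none of the intervals U_k = (1/k, 1)."""
    K = max(ks)
    x = Fraction(1, K + 1)        # 0 < x < 1 and x < 1/k for every k in ks
    assert 0 < x < 1
    assert not any(Fraction(1, k) < x < 1 for k in ks)
    return x

print(uncovered_point([3, 10, 7]))   # 1/11
```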




Example 3.25 


As a subset of itself, R^n is not compact. For k ∈ Z^+, let U_k = B(0, k) be the ball in R^n centered at the origin and having radius k. Example 3.22 says that 𝒜 = {U_k | k ∈ Z^+} is an open covering of R^n. We claim that there is no finite subcover.

Assume to the contrary that there is a finite subcover. Then there exist positive integers k_1, ..., k_m such that

$$\mathbb{R}^n = \bigcup_{j=1}^{m} U_{k_j}.$$

Notice that if k_i ≤ k_j, then U_{k_i} ⊂ U_{k_j}. Thus, if K = max{k_1, ..., k_m}, then

$$\bigcup_{j=1}^{m} U_{k_j} = U_K = B(0, K).$$

Obviously, B(0, K) is not equal to R^n. This gives a contradiction. Hence, R^n is not compact.


Our goal is to prove the Heine-Borel theorem, which says that a subset of R” is 
compact if and only if it is closed and bounded. We first prove the easier direction. 


Theorem 3.25 


Let S be a subset of R^n. If S is compact, then it is closed and bounded.

We show that if S is compact, then it is bounded; and if S is compact, then it is closed.

First we prove that if S is compact, then it is bounded. For k ∈ Z^+, let U_k = B(0, k) be the ball in R^n centered at the origin and having radius k. Example 3.22 says that 𝒜 = {U_k | k ∈ Z^+} is an open covering of S. Since S is compact, there exist positive integers k_1, ..., k_m such that

$$S \subset \bigcup_{j=1}^{m} U_{k_j} = U_K = B(0, K),$$

where K = max{k_1, ..., k_m}. This shows that

$$\|x\| < K \quad \text{for all } x \in S.$$

Hence, S is bounded.

Now we prove that if S is compact, then it is closed. For this, it suffices to show that S̄ ⊂ S, or equivalently, any point that is not in S is not in S̄. Assume that x_0 is not in S. For each k ∈ Z^+, let

$$V_k = \left\{ x \in \mathbb{R}^n \;\middle|\; \|x - x_0\| > \frac{1}{k} \right\}.$$

Then V_k is open in R^n. If x is a point in R^n and x ≠ x_0, then r = ‖x − x_0‖ > 0. There is a k ∈ Z^+ such that 1/k < r. Then x is in V_k. This shows that

$$\bigcup_{k=1}^{\infty} V_k = \mathbb{R}^n \setminus \{x_0\}.$$

Since x_0 is not in S, 𝒜 = {V_k | k ∈ Z^+} is an open covering of S. Since S is compact, there is a finite subcover. Namely, there exist positive integers k_1, ..., k_m such that

$$S \subset \bigcup_{j=1}^{m} V_{k_j} = V_K,$$

where K = max{k_1, ..., k_m}. Since B(x_0, 1/K) is disjoint from V_K, it does not contain any point of S. This shows that x_0 is not in S̄, and thus the proof is completed.


Example 3.26 


The set 
A = {(x, y, z) | xyz = 1}


in Example 3.20 is not compact because it is not bounded. The set 


C((a7,2))| Ws 7 2,0ay a 30 747 <4} 


is not compact because it is not closed. 




We are now left to show that a closed and bounded subset of R^n is compact. We start by proving a special case.

Theorem 3.26

A closed rectangle R = [a_1, b_1] × ··· × [a_n, b_n] in R^n is compact.


We will prove by contradiction. Assume that R is not compact, and we show that this will lead to a contradiction. The idea is to use the bisection method.

If R is not compact, there is an open covering 𝒜 = {U_α | α ∈ J} of R which does not have a finite subcover.

Let R_1 = R, and let d_1 = diam R_1. For 1 ≤ i ≤ n, let a_{i,1} = a_i and b_{i,1} = b_i, and let m_{i,1} be the midpoint of the interval [a_{i,1}, b_{i,1}]. The hyperplanes x_i = m_{i,1}, 1 ≤ i ≤ n, divide the rectangle R_1 into 2^n subrectangles. Notice that 𝒜 is also an open covering of each of these subrectangles. If each of these subrectangles can be covered by a finite subcollection of open sets in 𝒜, then R also can be covered by a finite subcollection of open sets in 𝒜. Since we assume R cannot be covered by any finite subcollection of open sets in 𝒜, there is at least one of the 2^n subrectangles which cannot be covered by any finite subcollection of open sets in 𝒜. Choose one of these, and denote it by R_2.

Define a_{i,2}, b_{i,2} for 1 ≤ i ≤ n so that

$$R_2 = [a_{1,2}, b_{1,2}] \times \cdots \times [a_{n,2}, b_{n,2}].$$

Note that

$$b_{i,2} - a_{i,2} = \frac{b_{i,1} - a_{i,1}}{2} \quad \text{for } 1 \le i \le n.$$

Therefore, d_2 = diam R_2 = d_1/2.

We continue this bisection process to obtain the rectangles R_1, R_2, ..., so that R_{k+1} ⊂ R_k for all k ∈ Z^+, and R_k cannot be covered by any finite subcollection of 𝒜.




Figure 3.14: Bisection method. 


Define a_{i,k}, b_{i,k} for 1 ≤ i ≤ n so that

$$R_k = [a_{1,k}, b_{1,k}] \times \cdots \times [a_{n,k}, b_{n,k}].$$

Then for all k ∈ Z^+,

$$b_{i,k+1} - a_{i,k+1} = \frac{b_{i,k} - a_{i,k}}{2} \quad \text{for } 1 \le i \le n.$$

It follows that d_{k+1} = diam R_{k+1} = d_k/2.

For any 1 ≤ i ≤ n, {a_{i,k}}_{k=1}^∞ is an increasing sequence that is bounded above by b_i, and {b_{i,k}}_{k=1}^∞ is a decreasing sequence that is bounded below by a_i. By the monotone convergence theorem, the sequence {a_{i,k}}_{k=1}^∞ converges to a_{i,0} = sup_{k ∈ Z^+} a_{i,k}, while the sequence {b_{i,k}}_{k=1}^∞ converges to b_{i,0} = inf_{k ∈ Z^+} b_{i,k}. Since

$$b_{i,k} - a_{i,k} = \frac{b_i - a_i}{2^{k-1}} \quad \text{for all } k \in \mathbb{Z}^+,$$

we find that a_{i,0} = b_{i,0}. Let c_i = a_{i,0} = b_{i,0}. Then a_{i,k} ≤ c_i ≤ b_{i,k} for all 1 ≤ i ≤ n and all k ∈ Z^+. Thus, c = (c_1, ..., c_n) is a point in R_k for all k ∈ Z^+. By the assumption that 𝒜 is an open covering of R = R_1, there exists β ∈ J such that c ∈ U_β. Since U_β is an open set, there is an r > 0 such that B(c, r) ⊂ U_β. Since

$$d_k = \operatorname{diam} R_k = \frac{d_1}{2^{k-1}} \quad \text{for all } k \in \mathbb{Z}^+,$$




we find that lim_{k→∞} d_k = 0. Hence, there is a positive integer K such that d_K < r. If x ∈ R_K, then

$$\|x - c\| \le \operatorname{diam} R_K = d_K < r.$$

This implies that x is in B(c, r). Thus, we have shown that R_K ⊂ B(c, r). Therefore, R_K is contained in the single element U_β of 𝒜, which contradicts the fact that R_K cannot be covered by any finite subcollection of open sets in 𝒜.

We conclude that R must be compact.
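The geometric part of the bisection argument (nested rectangles whose diameters halve at each step) can be simulated. The sketch below is an illustration under a loudly stated assumption: in the proof the subrectangle R_{k+1} is chosen because it admits no finite subcover, which is not something a program can test, so here the subrectangle containing a fixed hypothetical `target` point is chosen instead.

```python
import math

def bisect_towards(a, b, target):
    """Return the subrectangle, among the 2^n obtained by cutting
    [a_1, b_1] x ... x [a_n, b_n] at its midpoints, that contains `target`."""
    new_a, new_b = [], []
    for ai, bi, ti in zip(a, b, target):
        mi = (ai + bi) / 2
        if ti <= mi:
            new_a.append(ai)
            new_b.append(mi)
        else:
            new_a.append(mi)
            new_b.append(bi)
    return new_a, new_b

def diam(a, b):
    return math.sqrt(sum((bi - ai) ** 2 for ai, bi in zip(a, b)))

a, b = [0.0, 0.0], [4.0, 2.0]       # hypothetical starting rectangle R_1
target = [1.3, 0.7]                  # stands in for the limit point c
d1 = diam(a, b)
for k in range(1, 11):
    assert math.isclose(diam(a, b), d1 / 2 ** (k - 1))   # d_k = d_1 / 2^(k-1)
    assert all(ai <= ti <= bi for ai, ti, bi in zip(a, target, b))
    a, b = bisect_towards(a, b, target)
print(a, b)
```

After the loop, `a` and `b` describe R_11, whose diameter is d_1/2^10, so the rectangles shrink to the single point they all contain.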


Now we can prove the Heine-Borel theorem. 


Theorem 3.27 Heine-Borel Theorem 


Let S be a subset of R^n. Then S is compact if and only if it is closed and bounded.

We have shown in Theorem 3.25 that if S is compact, then it must be closed and bounded.
Now assume that S is closed and bounded, and let 𝒜 = {U_α | α ∈ J} be an open covering of S. Since S is bounded, there exists a positive number M such that

$$\|x\| \le M \quad \text{for all } x \in S.$$

Thus, if x = (x_1, ..., x_n) is in S, then for all 1 ≤ i ≤ n, |x_i| ≤ ‖x‖ ≤ M. This implies that S is contained in the closed rectangle

$$R = [-M, M] \times \cdots \times [-M, M].$$

Let V = R^n \ S. Since S is closed, V is an open set. Then 𝒜′ = 𝒜 ∪ {V} is an open covering of R^n, and hence it is an open covering of R. By Theorem 3.26, R is compact. Thus, there exists ℬ′ ⊂ 𝒜′ which is a finite subcover of R. Then ℬ = ℬ′ \ {V} is a finite subcollection of 𝒜 that covers S. This proves that S is compact.




Example 3.27 


We have shown in Example 3.20 that the set 


B = {(x, y, z) | x^2 + 4y^2 + 9z^2 ≤ 36}


is closed and bounded. Hence, it is compact. 


Now we can conclude our main theorem from the Bolzano-Weierstrass theorem and the Heine-Borel theorem.


Theorem 3.28 


Let S be a subset of R^n. Then the following are equivalent.
(a) S is sequentially compact.
(b) S is closed and bounded.
(c) S is compact.


Remark 3.5 


Henceforth, when we say a subset S of R^n is compact, we mean it is a closed and bounded set, and it is sequentially compact. By Theorem 3.19, a subset S of R^n has compact closure if and only if it is a bounded set.


Finally, we can conclude the following, which says that continuous functions preserve compactness.

Theorem 3.29

Let D be a compact subset of R^n. If the function F : D → R^m is continuous, then F(D) is a compact subset of R^m.

Since D is compact, it is sequentially compact. By Theorem 3.24, F(D) is a sequentially compact subset of R^m. Hence, F(D) is a compact subset of R^m.




Exercises 3.3 


Question 1 


Determine whether each of the following subsets of R^2 is sequentially compact.
(a) A = {(x, y) | x^2 + y^2 = 9}.

(b) B = {(x, y) | 0 < x^2 + 4y^2 < 36}.

(ee 4120) 2 ON ya 

Question 2 

Determine whether each of the following subsets of R^3 is compact.

(a A=Aaa 2) a 2. 


(b) B = {(x, y, z) | |x| + |y| + |z| < 10}.

(c) C = {(x, y, z) | 4 < x^2 + y^2 + z^2 < 9}.


Question 3 


Given that A is a compact subset of R” and B is a subset of A, show that 
B is compact if and only if it is closed. 


Question 4 


If S_1, ..., S_k are compact subsets of R^n, show that S = S_1 ∪ ··· ∪ S_k is also compact.


Question 5 


If A is a compact subset of R^m and B is a compact subset of R^n, show that A × B is a compact subset of R^{m+n}.




3.4 Applications of Compactness 


In this section, we consider the applications of compactness. We are going to use repeatedly the fact that a subset S of R^n is compact if and only if it is closed and bounded, if and only if it is sequentially compact.


3.4.1 The Extreme Value Theorem 
First we define bounded functions. 


Definition 3.15 Bounded Functions 


Let D be a subset of R^n, and let F : D → R^m be a function defined on D. We say that the function F is bounded if the set F(D) is a bounded subset of R^m. In other words, the function F : D → R^m is bounded if there is a positive number M such that

$$\|F(x)\| \le M \quad \text{for all } x \in D.$$


Example 3.28 


Let D = {(x, y, z) | 0 < x^2 + y^2 + z^2 < 4}, and let F : D → R^2 be the function defined as

$$F(x, y, z) = \left(\frac{1}{x^2 + y^2 + z^2},\; x + y + z\right).$$

For k ∈ Z^+, the point u_k = (1/k, 0, 0) is in D and

$$F(u_k) = \left(k^2, \frac{1}{k}\right).$$

Thus, ‖F(u_k)‖ ≥ k^2. This shows that F is not bounded, even though D is a bounded set.
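The growth of ‖F(u_k)‖ in Example 3.28 is easy to observe numerically. The following Python sketch evaluates F at a few of the points u_k = (1/k, 0, 0); the tolerance in the second assertion only allows for floating-point rounding.

```python
import math

def F(x, y, z):
    s = x * x + y * y + z * z
    return (1.0 / s, x + y + z)

def norm(v):
    return math.sqrt(sum(c * c for c in v))

for k in (1, 10, 100, 1000):
    u_k = (1.0 / k, 0.0, 0.0)
    assert 0 < sum(c * c for c in u_k) < 4          # u_k lies in D
    assert norm(F(*u_k)) >= k ** 2 * (1 - 1e-9)     # ||F(u_k)|| >= k^2 up to rounding
    print(k, norm(F(*u_k)))
```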


Theorem 3.24 gives the following. 




Theorem 3.30 


Let D be a compact subset of R^n. If the function F : D → R^m is continuous, then it is bounded.

Proof

By Theorem 3.29, F(D) is compact. Hence, it is bounded.


Example 3.29 


Let D = {(x, y, z) | 1 < x^2 + y^2 + z^2 < 4}, and let F : D → R^2 be the function defined as

$$F(x, y, z) = \left(\frac{1}{x^2 + y^2 + z^2},\; x + y + z\right).$$

Show that F : D → R^2 is a bounded function.

Solution
Notice that the set D is not closed. Therefore, we cannot apply Theorem 3.30 directly. Consider the set U = {(x, y, z) | 1 ≤ x^2 + y^2 + z^2 ≤ 4}. For any u = (x, y, z) in U, ‖u‖ ≤ 2. Hence, U is bounded. The function f : R^3 → R defined as f(x, y, z) = x^2 + y^2 + z^2 is continuous, and U = f^{-1}([1, 4]). Since [1, 4] is closed in R, U is closed in R^3. Since f(x, y, z) ≠ 0 on U,

$$F_1(x, y, z) = \frac{1}{x^2 + y^2 + z^2}$$

is continuous on U. Being a polynomial function, F_2(x, y, z) = x + y + z is continuous. Thus, F : U → R^2 is continuous. Since U is closed and bounded, Theorem 3.30 implies that F : U → R^2 is bounded. Since D ⊂ U, F : D → R^2 is also a bounded function.


Recall that if S is a subset of R, S has a maximum value if and only if S is bounded above and sup S is in S; while S has a minimum value if and only if S is bounded below and inf S is in S.


Definition 3.16 Extremizer and Extreme Values 


Let D be a subset of R^n, and let f : D → R be a function defined on D.

1. The function f has a maximum value if there is a point x_0 in D such that

$$f(x) \le f(x_0) \quad \text{for all } x \in D.$$

The point x_0 is called a maximizer of f; and f(x_0) is the maximum value of f.

2. The function f has a minimum value if there is a point x_0 in D such that

$$f(x_0) \le f(x) \quad \text{for all } x \in D.$$

The point x_0 is called a minimizer of f; and f(x_0) is the minimum value of f.


We have proved in volume I that a sequentially compact subset of R has a maximum value and a minimum value. This gives us the extreme value theorem.


Theorem 3.31 Extreme Value Theorem 


Let D be a compact subset of R^n. If the function f : D → R is continuous, then it has a maximum value and a minimum value.

By Theorem 3.24, f(D) is a sequentially compact subset of R. Therefore, f has a maximum value and a minimum value.

Example 3.30

Let D = {(x, y) | x^2 + 2x + y^2 ≤ 3}, and let f : D → R be the function defined by

$$f(x, y) = x^2 + xy^2 + e^{x - y}.$$

Show that f has a maximum value and a minimum value.




Solution
Notice that

$$D = \{(x, y) \mid x^2 + 2x + y^2 \le 3\} = \{(x, y) \mid (x + 1)^2 + y^2 \le 4\}$$

is a closed ball. Thus, it is closed and bounded. The function f_1(x, y) = x^2 + xy^2 and the function g(x, y) = x − y are polynomial functions. Hence, they are continuous. The exponential function h(x) = e^x is continuous. Hence, the function f_2(x, y) = (h ∘ g)(x, y) = e^{x−y} is continuous. Since f = f_1 + f_2, the function f : D → R is continuous. Since D is compact, the function f : D → R has a maximum value and a minimum value.
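The extreme value theorem guarantees that the extreme values in Example 3.30 exist but does not locate them. As a rough numerical complement, the Python sketch below evaluates f on a grid inside D. The grid resolution is an arbitrary choice, and the printed values are only approximations from a finite sample, not the exact extreme values.

```python
import math

def f(x, y):
    return x * x + x * y * y + math.exp(x - y)

def in_D(x, y):
    return (x + 1) ** 2 + y ** 2 <= 4     # closed ball of radius 2 centred at (-1, 0)

n = 400                                    # grid resolution (arbitrary choice)
values = []
for i in range(n + 1):
    for j in range(n + 1):
        x = -3 + 4 * i / n                 # x ranges over [-3, 1]
        y = -2 + 4 * j / n                 # y ranges over [-2, 2]
        if in_D(x, y):
            values.append((f(x, y), x, y))

print("approximate minimum:", min(values))
print("approximate maximum:", max(values))
```

Each tuple holds the function value followed by the grid point where it is attained, so `min` and `max` report both.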


Remark 3.6 Extreme Value Property 


Let S be a subset of R^n. We say that S has the extreme value property provided that whenever f : S → R is a continuous function, f has maximum and minimum values.

The extreme value theorem says that if S is compact, then it has the extreme value property. Now let us show the converse. Namely, if S has the extreme value property, then it is compact, or equivalently, it is closed and bounded.

If S is not bounded, the function f : S → R, f(x) = ‖x‖ is continuous, but it does not have a maximum value. If S is not closed, there is a sequence {x_k} in S that converges to a point x_0 that is not in S. The function g : S → R, g(x) = ‖x − x_0‖ is continuous and g(x) ≥ 0 for all x ∈ S. Since lim_{k→∞} g(x_k) = 0, we find that inf g(S) = 0. Since x_0 is not in S, there is no point x in S such that g(x) = 0. Hence, g does not have a minimum value. This shows that for S to have the extreme value property, it is necessary that S is closed and bounded.

Therefore, a subset S of R^n has the extreme value property if and only if it is compact.


3.4.2 Distance Between Sets 


The distance between two sets is defined in the following way. 




Definition 3.17 Distance Between Two Sets 


Let A and B be two subsets of R^n. The distance between A and B is defined as

$$d(A, B) = \inf\{d(a, b) \mid a \in A,\; b \in B\}.$$

The distance between two sets is always well-defined and nonnegative. If A and B are not disjoint, then their distance is 0.


Example 3.31 
Let $A = \{(x, y) \mid x^2 + y^2 < 1\}$ and let $B = [1, 3] \times [-1, 1]$. Find the
distance between the two sets $A$ and $B$.


Solution 
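For $k \in \mathbb{Z}^+$, let $\mathbf{a}_k$ be the point in $A$ given by
\[ \mathbf{a}_k = \left(1 - \frac{1}{k},\ 0\right). \]
Let $\mathbf{b} = (1, 0)$. Then $\mathbf{b}$ is in $B$. Notice that
\[ d(\mathbf{a}_k, \mathbf{b}) = \|\mathbf{a}_k - \mathbf{b}\| = \frac{1}{k}. \]
Hence, $d(A, B) \le \frac{1}{k}$ for all $k \in \mathbb{Z}^+$. This shows that the distance between
$A$ and $B$ is 0.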


Figure 3.15: The sets A and B in Example 3.31. 




In Example 3.31, we find that the distance between two disjoint sets can be 0, 
even though they are both bounded. 


Example 3.32 


Let $A = \{(x, y) \mid y = 0\}$ and let $B = \{(x, y) \mid xy = 1\}$. Find the distance
between the two sets $A$ and $B$.


Solution 
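For $k \in \mathbb{Z}^+$, let $\mathbf{a}_k = (k, 0)$ and $\mathbf{b}_k = (k, 1/k)$. Then $\mathbf{a}_k$ is in $A$ and $\mathbf{b}_k$ is
in $B$. Notice that
\[ d(\mathbf{a}_k, \mathbf{b}_k) = \|\mathbf{a}_k - \mathbf{b}_k\| = \frac{1}{k}. \]
Hence, $d(A, B) \le \frac{1}{k}$ for all $k \in \mathbb{Z}^+$. This shows that the distance between
$A$ and $B$ is 0.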


Figure 3.16: The sets A and B in Example 3.32. 


In Example 3.32, we find that the distance between two disjoint sets can be 0, 
even though both of them are closed. 
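
The two examples can be checked numerically. The following Python sketch (the helper names are illustrative and not part of the text) evaluates the distances $d(\mathbf{a}_k, \mathbf{b})$ of Example 3.31 and $d(\mathbf{a}_k, \mathbf{b}_k)$ of Example 3.32 for growing $k$. Both shrink like $1/k$, which is why the infimum is 0 in each case.

```python
import math

def dist(p, q):
    """Euclidean distance between two points of R^2."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def gap_example_3_31(k):
    # a_k = (1 - 1/k, 0) lies in the open unit disc A; b = (1, 0) lies in B.
    return dist((1 - 1 / k, 0.0), (1.0, 0.0))

def gap_example_3_32(k):
    # a_k = (k, 0) lies on the x-axis A; b_k = (k, 1/k) lies on the hyperbola B.
    return dist((float(k), 0.0), (float(k), 1 / k))

for k in (1, 10, 100, 1000):
    print(k, gap_example_3_31(k), gap_example_3_32(k))
```

No pair of points realizes distance 0 in either example; only the infimum is 0.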

When $B$ is the one-point set $\{\mathbf{x}_0\}$, the distance between $A$ and $B$ is the
distance from the point $\mathbf{x}_0$ to the set $A$. We denote this distance as $d(\mathbf{x}_0, A)$.
In other words,

\[ d(\mathbf{x}_0, A) = \inf\{d(\mathbf{a}, \mathbf{x}_0) \mid \mathbf{a} \in A\}. \]

If $\mathbf{x}_0$ is a point in $A$, then $d(\mathbf{x}_0, A) = 0$. However, the distance from a point $\mathbf{x}_0$ to
a set $A$ can be 0 even though $\mathbf{x}_0$ is not in $A$. For example, the distance between
the point $\mathbf{x}_0 = (1, 0)$ and the set $A = \{(x, y) \mid x^2 + y^2 < 1\}$ is 0, even though $\mathbf{x}_0$
is not in $A$. The following proposition says that this cannot happen if $A$ is closed.


Proposition 3.32 


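Let $A$ be a closed subset of $\mathbb{R}^n$ and let $\mathbf{x}_0$ be a point in $\mathbb{R}^n$. Then $d(\mathbf{x}_0, A) = 0$
if and only if $\mathbf{x}_0$ is in $A$.


If $\mathbf{x}_0$ is in $A$, it is obvious that $d(\mathbf{x}_0, A) = 0$.

Conversely, if $\mathbf{x}_0$ is not in $A$, $\mathbf{x}_0$ is in the open set $\mathbb{R}^n \setminus A$. Therefore,
there is an $r > 0$ such that $B(\mathbf{x}_0, r) \subset \mathbb{R}^n \setminus A$. For any $\mathbf{a} \in A$, $\mathbf{a} \notin B(\mathbf{x}_0, r)$.
Therefore, $\|\mathbf{x}_0 - \mathbf{a}\| \ge r$. Taking infimum over $\mathbf{a} \in A$, we find
that $d(\mathbf{x}_0, A) \ge r$. Hence, $d(\mathbf{x}_0, A) \ne 0$.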


Figure 3.17: A point outside a closed set has positive distance from the set. 


Proposition 3.33 


Given a subset $A$ of $\mathbb{R}^n$, define the function $f : \mathbb{R}^n \to \mathbb{R}$ by
\[ f(\mathbf{x}) = d(\mathbf{x}, A). \]
Then $f$ is a continuous function.


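We prove something stronger. For any $\mathbf{u}$ and $\mathbf{v}$ in $\mathbb{R}^n$, we claim that
\[ |f(\mathbf{u}) - f(\mathbf{v})| \le \|\mathbf{u} - \mathbf{v}\|. \]
This means that $f$ is a Lipschitz function with Lipschitz constant 1, which
implies that it is continuous.

Given $\mathbf{u}$ and $\mathbf{v}$ in $\mathbb{R}^n$, if $\mathbf{a}$ is in $A$, then
\[ d(\mathbf{u}, A) \le \|\mathbf{u} - \mathbf{a}\| \le \|\mathbf{v} - \mathbf{a}\| + \|\mathbf{u} - \mathbf{v}\|. \]
This shows that
\[ \|\mathbf{v} - \mathbf{a}\| \ge d(\mathbf{u}, A) - \|\mathbf{u} - \mathbf{v}\|. \]
Taking infimum over $\mathbf{a} \in A$, we find that
\[ d(\mathbf{v}, A) \ge d(\mathbf{u}, A) - \|\mathbf{u} - \mathbf{v}\|. \]
Therefore,
\[ f(\mathbf{u}) - f(\mathbf{v}) \le \|\mathbf{u} - \mathbf{v}\|. \]
Interchanging $\mathbf{u}$ and $\mathbf{v}$, we obtain
\[ f(\mathbf{v}) - f(\mathbf{u}) \le \|\mathbf{u} - \mathbf{v}\|. \]
This proves that
\[ |f(\mathbf{u}) - f(\mathbf{v})| \le \|\mathbf{u} - \mathbf{v}\|. \]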
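
The Lipschitz estimate can be sanity-checked numerically when $A$ is a finite set, so that the infimum is a minimum. The sketch below is an illustration under that assumption; the set `A`, the sampling box and the helper names are made up for the example.

```python
import math
import random

def dist(p, q):
    """Euclidean distance between two points of R^2."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def dist_to_set(x, A):
    """d(x, A) for a finite, nonempty set A of points in R^2."""
    return min(dist(x, a) for a in A)

def max_lipschitz_ratio(A, trials=10_000, seed=0):
    """Largest observed value of |f(u) - f(v)| / ||u - v|| over random pairs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        u = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        v = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        gap = dist(u, v)
        if gap > 0:
            ratio = abs(dist_to_set(u, A) - dist_to_set(v, A)) / gap
            worst = max(worst, ratio)
    return worst

A = [(0.0, 0.0), (2.0, 1.0), (-1.0, 3.0)]
print(max_lipschitz_ratio(A))  # stays at or below 1, up to rounding
```

A random search like this cannot prove the inequality; it only fails to find a counterexample, in agreement with the proof.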


Now we can prove the following. 


Theorem 3.34 
Let $A$ and $C$ be disjoint subsets of $\mathbb{R}^n$. If $A$ is compact and $C$ is closed,
then the distance between $A$ and $C$ is positive.


By Proposition 3.33, the function $f : A \to \mathbb{R}$, $f(\mathbf{a}) = d(\mathbf{a}, C)$ is
continuous. Since $A$ is compact, $f$ has a minimum value. Namely, there is
a point $\mathbf{a}_0$ in $A$ such that
\[ d(\mathbf{a}_0, C) \le d(\mathbf{a}, C) \quad \text{for all } \mathbf{a} \in A. \]
For any $\mathbf{a}$ in $A$ and $\mathbf{c} \in C$,
\[ d(\mathbf{a}, \mathbf{c}) \ge d(\mathbf{a}, C) \ge d(\mathbf{a}_0, C). \]
Taking infimum over all $\mathbf{a} \in A$ and $\mathbf{c} \in C$, we find that
\[ d(A, C) \ge d(\mathbf{a}_0, C). \]
By definition, we also have $d(A, C) \le d(\mathbf{a}_0, C)$. Thus, $d(A, C) = d(\mathbf{a}_0, C)$.
Since $A$ and $C$ are disjoint, $\mathbf{a}_0$ is not in $C$. Since $C$ is closed, Proposition 3.32
implies that $d(A, C) = d(\mathbf{a}_0, C) > 0$.


An equivalent form of Theorem 3.34 is the following important theorem. 


Theorem 3.35 


Let $A$ be a compact subset of $\mathbb{R}^n$, and let $U$ be an open subset of $\mathbb{R}^n$ that
contains $A$. Then there is a positive number $\delta$ such that if $\mathbf{x}$ is a point in $\mathbb{R}^n$
that has a distance less than $\delta$ from the set $A$, then $\mathbf{x}$ is in $U$.


Figure 3.18: A compact set has a positive distance from the boundary of the open set that contains it.


Let $C = \mathbb{R}^n \setminus U$. Then $C$ is a closed subset of $\mathbb{R}^n$ that is disjoint from $A$.
By Theorem 3.34, $\delta = d(A, C) > 0$. If $\mathbf{x}$ is in $\mathbb{R}^n$ and $d(\mathbf{x}, A) < \delta$, then $\mathbf{x}$
cannot be in $C$. Therefore, $\mathbf{x}$ is in $U$.


As a corollary, we have the following. 


Corollary 3.36 


Let $A$ be a compact subset of $\mathbb{R}^n$, and let $U$ be an open subset of $\mathbb{R}^n$ that
contains $A$. Then there is a positive number $r$ and a compact set $K$ such
that $A \subset K \subset U$, and if $\mathbf{x}$ is a point in $\mathbb{R}^n$ that has a distance less than $r$
from the set $A$, then $\mathbf{x}$ is in $K$.


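By Theorem 3.35, there is a positive number $\delta$ such that if $\mathbf{x}$ is a point in $\mathbb{R}^n$
that has a distance less than $\delta$ from the set $A$, then $\mathbf{x}$ is in $U$. Take $r = \delta/2$,
and let
\[ K = \overline{V}, \quad \text{where } V = \bigcup_{\mathbf{u} \in A} B(\mathbf{u}, r). \]

Since $A$ is compact, it is bounded. There is a positive number $M$ such that
$\|\mathbf{u}\| \le M$ for all $\mathbf{u} \in A$. If $\mathbf{x} \in V$, then there is a $\mathbf{u} \in A$ such that
$\|\mathbf{x} - \mathbf{u}\| < r$. This implies that $\|\mathbf{x}\| < M + r$. Hence, the set $V$ is also
bounded. Since $K$ is the closure of a bounded set, $K$ is compact. Since
$A \subset V$, $A \subset K$. If $\mathbf{w} \in K$, since $K$ is the closure of $V$, there is a point
$\mathbf{v}$ in $V$ that lies in $B(\mathbf{w}, r)$. By the definition of $V$, there is a point $\mathbf{u}$ in $A$
such that $\mathbf{v} \in B(\mathbf{u}, r)$. Thus,
\[ \|\mathbf{w} - \mathbf{u}\| \le \|\mathbf{w} - \mathbf{v}\| + \|\mathbf{v} - \mathbf{u}\| < r + r = \delta. \]
This implies that $\mathbf{w}$ has a distance less than $\delta$ from $A$. Hence, $\mathbf{w}$ is in $U$.
This shows that $K \subset U$.

Now if $\mathbf{x}$ is a point that has distance less than $r$ from the set $A$, there is a
point $\mathbf{u}$ in $A$ such that $\|\mathbf{x} - \mathbf{u}\| < r$. This implies that $\mathbf{x} \in B(\mathbf{u}, r) \subset V \subset K$.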




3.4.3 Uniform Continuity 


In Section 2.4, we have discussed uniform continuity. Let $D$ be a subset of $\mathbb{R}^n$
and let $\mathbf{F} : D \to \mathbb{R}^m$ be a function defined on $D$. We say that $\mathbf{F} : D \to \mathbb{R}^m$ is
uniformly continuous provided that for any $\varepsilon > 0$, there exists $\delta > 0$ such that for
any points $\mathbf{u}$ and $\mathbf{v}$ in $D$, if $\|\mathbf{u} - \mathbf{v}\| < \delta$, then $\|\mathbf{F}(\mathbf{u}) - \mathbf{F}(\mathbf{v})\| < \varepsilon$. If a function
is uniformly continuous, it is continuous. The converse is not true. However,
a continuous function that is defined on a compact subset of $\mathbb{R}^n$ is uniformly
continuous. This is an important theorem in analysis.
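
A standard illustration of why the converse fails (it is not taken from the passage above) is $f(x) = 1/x$ on the non-compact set $(0, 1]$. The Python sketch below picks $u_k = 1/k$ and $v_k = 1/(2k)$: the inputs get arbitrarily close, yet the outputs stay at least 1 apart, so no single $\delta$ works for $\varepsilon = 1$.

```python
def f(x):
    """f(x) = 1/x, continuous on (0, 1] but not uniformly continuous there."""
    return 1.0 / x

def witness(k):
    """Return (|u_k - v_k|, |f(u_k) - f(v_k)|) for u_k = 1/k, v_k = 1/(2k)."""
    u, v = 1.0 / k, 1.0 / (2 * k)
    return abs(u - v), abs(f(u) - f(v))

for k in (1, 10, 100, 1000):
    input_gap, output_gap = witness(k)
    print(k, input_gap, output_gap)  # input gap shrinks, output gap grows like k
```

The domain $(0, 1]$ is bounded but not closed, so this does not conflict with the theorem for compact sets.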


Theorem 3.37 


Let $D$ be a subset of $\mathbb{R}^n$, and let $\mathbf{F} : D \to \mathbb{R}^m$ be a continuous function
defined on $D$. If $D$ is compact, then $\mathbf{F} : D \to \mathbb{R}^m$ is uniformly continuous.


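Assume to the contrary that $\mathbf{F} : D \to \mathbb{R}^m$ is not uniformly continuous.
Then there exists an $\varepsilon > 0$ such that for any $\delta > 0$, there exist points $\mathbf{u}$ and $\mathbf{v}$ in
$D$ with $\|\mathbf{u} - \mathbf{v}\| < \delta$ and $\|\mathbf{F}(\mathbf{u}) - \mathbf{F}(\mathbf{v})\| \ge \varepsilon$. This implies that
for any $k \in \mathbb{Z}^+$, there exist $\mathbf{u}_k$ and $\mathbf{v}_k$ in $D$ such that $\|\mathbf{u}_k - \mathbf{v}_k\| < 1/k$
and $\|\mathbf{F}(\mathbf{u}_k) - \mathbf{F}(\mathbf{v}_k)\| \ge \varepsilon$. Since $D$ is sequentially compact, there is a
subsequence $\{\mathbf{u}_{k_j}\}$ of $\{\mathbf{u}_k\}$ that converges to a point $\mathbf{u}_0$ in $D$. Consider the
sequence $\{\mathbf{v}_{k_j}\}$ in $D$. It has a subsequence $\{\mathbf{v}_{k_{j_l}}\}$ that converges to a point
$\mathbf{v}_0$ in $D$. Being a subsequence of $\{\mathbf{u}_{k_j}\}$, the sequence $\{\mathbf{u}_{k_{j_l}}\}$ also converges
to $\mathbf{u}_0$.

Since $\mathbf{F} : D \to \mathbb{R}^m$ is continuous, the sequences $\{\mathbf{F}(\mathbf{u}_{k_{j_l}})\}$ and $\{\mathbf{F}(\mathbf{v}_{k_{j_l}})\}$
converge to $\mathbf{F}(\mathbf{u}_0)$ and $\mathbf{F}(\mathbf{v}_0)$ respectively. Notice that by construction,
\[ \|\mathbf{F}(\mathbf{u}_{k_{j_l}}) - \mathbf{F}(\mathbf{v}_{k_{j_l}})\| \ge \varepsilon \quad \text{for all } l \in \mathbb{Z}^+. \]
Thus, $\|\mathbf{F}(\mathbf{u}_0) - \mathbf{F}(\mathbf{v}_0)\| \ge \varepsilon$. This implies that $\mathbf{F}(\mathbf{u}_0) \ne \mathbf{F}(\mathbf{v}_0)$, and so
$\mathbf{u}_0 \ne \mathbf{v}_0$.

Since $k_{j_1}, k_{j_2}, \ldots$ is a strictly increasing sequence of positive integers, $k_{j_l} \ge l$.
Thus,
\[ \|\mathbf{u}_{k_{j_l}} - \mathbf{v}_{k_{j_l}}\| < \frac{1}{k_{j_l}} \le \frac{1}{l}. \]
Taking $l \to \infty$ implies that $\mathbf{u}_0 = \mathbf{v}_0$. This gives a contradiction. Thus,
$\mathbf{F} : D \to \mathbb{R}^m$ must be uniformly continuous.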


Example 3.33 


Let $D = (-1, 4) \times (-7, 5]$ and let $\mathbf{F} : D \to \mathbb{R}^3$ be the function defined as
\[ \mathbf{F}(x, y) = \left(\sin(x + y),\ \sqrt{x + y + 8},\ e^{xy}\right). \]
Show that $\mathbf{F}$ is uniformly continuous.


Solution 
Let $\mathcal{U} = [-1, 4] \times [-7, 5]$. Then $\mathcal{U}$ is a closed and bounded subset of $\mathbb{R}^2$
that contains $D$. The functions $f_1(x, y) = x + y$, $f_2(x, y) = x + y + 8$ and
$f_3(x, y) = xy$ are polynomial functions. Hence, they are continuous. If
$(x, y) \in \mathcal{U}$, then $x \ge -1$, $y \ge -7$ and so $f_2(x, y) = x + y + 8 \ge 0$. Thus, $f_2(\mathcal{U})$
is contained in the domain of the square root function. Since the square
root function, the sine function and the exponential function are continuous
on their domains, we find that the functions
\[ F_1(x, y) = \sin(x + y), \quad F_2(x, y) = \sqrt{x + y + 8}, \quad F_3(x, y) = e^{xy} \]
are continuous on $\mathcal{U}$. Since $\mathcal{U}$ is closed and bounded, Theorem 3.37 implies that
$\mathbf{F} : \mathcal{U} \to \mathbb{R}^3$ is uniformly continuous. Since $D \subset \mathcal{U}$, $\mathbf{F} : D \to \mathbb{R}^3$ is uniformly continuous.


3.4.4 Linear Transformations and Quadratic Forms 


In Chapter 2, we have seen that a linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$ is a matrix
transformation. Namely, there exists an $m \times n$ matrix $A$ such that
\[ \mathbf{T}(\mathbf{x}) = A\mathbf{x} \quad \text{for all } \mathbf{x} \in \mathbb{R}^n. \]
A linear transformation is continuous. Theorem 2.34 says that a linear transformation
is Lipschitz. More precisely, there exists a positive constant $c$ such that
\[ \|\mathbf{T}(\mathbf{x})\| \le c\|\mathbf{x}\| \quad \text{for all } \mathbf{x} \in \mathbb{R}^n. \]
Theorem 2.5 says that when $m = n$, a linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$ is
invertible if and only if it is one-to-one, if and only if the matrix $A$ is invertible,
if and only if $\det A \ne 0$. Here we want to give a stronger characterization of a
linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$ that is invertible.

Recall that to show that a linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$ is one-to-one,
it is sufficient to show that $\mathbf{T}(\mathbf{x}) = \mathbf{0}$ implies that $\mathbf{x} = \mathbf{0}$.


Theorem 3.38 


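Let $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation. The following are equivalent.
(a) $\mathbf{T}$ is invertible.

(b) There is a positive constant $a$ such that
\[ \|\mathbf{T}(\mathbf{x})\| \ge a\|\mathbf{x}\| \quad \text{for all } \mathbf{x} \in \mathbb{R}^n. \]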


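(b) implies (a) is easy. Notice that (b) says that
\[ \|\mathbf{x}\| \le \frac{1}{a}\|\mathbf{T}(\mathbf{x})\| \quad \text{for all } \mathbf{x} \in \mathbb{R}^n. \tag{3.3} \]
If $\mathbf{T}(\mathbf{x}) = \mathbf{0}$, then $\|\mathbf{T}(\mathbf{x})\| = 0$. Eq. (3.3) implies that $\|\mathbf{x}\| = 0$. Thus,
$\mathbf{x} = \mathbf{0}$. This proves that $\mathbf{T}$ is one-to-one. Hence, it is invertible.

Conversely, assume that $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$ is invertible. Let
\[ S^{n-1} = \{(x_1, \ldots, x_n) \mid x_1^2 + \cdots + x_n^2 = 1\} \]
be the standard unit $(n-1)$-sphere in $\mathbb{R}^n$. We have seen that $S^{n-1}$ is
compact. For any $\mathbf{u} \in S^{n-1}$, $\mathbf{u} \ne \mathbf{0}$. Therefore, $\mathbf{T}(\mathbf{u}) \ne \mathbf{0}$ and so
$\|\mathbf{T}(\mathbf{u})\| > 0$. The function $f : S^{n-1} \to \mathbb{R}$, $f(\mathbf{u}) = \|\mathbf{T}(\mathbf{u})\|$ is continuous.
Hence, it has a minimum value at some $\mathbf{u}_0$ on $S^{n-1}$. Let $a = \|\mathbf{T}(\mathbf{u}_0)\|$.
Then $a > 0$. Since $a$ is the minimum value of $f$,
\[ \|\mathbf{T}(\mathbf{u})\| \ge a \quad \text{for all } \mathbf{u} \in S^{n-1}. \]
Notice that if $\mathbf{x} = \mathbf{0}$, $\|\mathbf{T}(\mathbf{x})\| \ge a\|\mathbf{x}\|$ holds trivially. If $\mathbf{x}$ is in $\mathbb{R}^n$ and
$\mathbf{x} \ne \mathbf{0}$, let $\mathbf{u} = \alpha\mathbf{x}$, where $\alpha = 1/\|\mathbf{x}\|$. Then $\mathbf{u}$ is in $S^{n-1}$. Therefore,
$\|\mathbf{T}(\mathbf{u})\| \ge a$. Since $\mathbf{T}(\mathbf{u}) = \alpha\mathbf{T}(\mathbf{x})$ and $\alpha > 0$, we find that $\|\mathbf{T}(\mathbf{u})\| = \alpha\|\mathbf{T}(\mathbf{x})\|$.
Hence, $\alpha\|\mathbf{T}(\mathbf{x})\| \ge a$. This gives
\[ \|\mathbf{T}(\mathbf{x})\| \ge \frac{a}{\alpha} = a\|\mathbf{x}\|. \]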
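
For a concrete $2 \times 2$ matrix, the constant $a$ in (b) can be computed explicitly: the minimum of $\|\mathbf{T}(\mathbf{u})\|$ over the unit circle is the square root of the smaller eigenvalue of $A^T A$. The Python sketch below relies on that standard linear algebra fact (it is not proved in this section), uses an arbitrarily chosen invertible matrix, and spot-checks $\|\mathbf{T}(\mathbf{x})\| \ge a\|\mathbf{x}\|$ on random vectors.

```python
import math
import random

A = [[2.0, 1.0],
     [0.0, 1.0]]  # illustrative invertible matrix, det A = 2

def apply(A, x):
    """Matrix-vector product Ax for a 2 x 2 matrix A."""
    return (A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1])

def norm(x):
    return math.hypot(x[0], x[1])

def min_stretch(A):
    """Minimum of ||Au|| over unit vectors u: sqrt of the smaller eigenvalue of A^T A."""
    p = A[0][0] ** 2 + A[1][0] ** 2             # entry (1,1) of A^T A
    q = A[0][0] * A[0][1] + A[1][0] * A[1][1]   # entry (1,2) of A^T A
    r = A[0][1] ** 2 + A[1][1] ** 2             # entry (2,2) of A^T A
    trace, det = p + r, p * r - q * q
    return math.sqrt((trace - math.sqrt(trace * trace - 4 * det)) / 2)

a = min_stretch(A)
rng = random.Random(1)
samples = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(1000)]
print(a, all(norm(apply(A, x)) >= a * norm(x) - 1e-9 for x in samples))
```

The small tolerance `1e-9` only absorbs floating-point rounding; the inequality itself is exact.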


In Section 2.1.5, we have reviewed some theories of quadratic forms from
linear algebra. In Theorem 2.7, we state that for a quadratic form $Q_A : \mathbb{R}^n \to \mathbb{R}$,
$Q_A(\mathbf{x}) = \mathbf{x}^T A\mathbf{x}$ defined by the symmetric matrix $A$, we have
\[ \lambda_n\|\mathbf{x}\|^2 \le Q_A(\mathbf{x}) \le \lambda_1\|\mathbf{x}\|^2 \quad \text{for all } \mathbf{x} \in \mathbb{R}^n. \]
Here $\lambda_n$ is the smallest eigenvalue of $A$, and $\lambda_1$ is the largest eigenvalue of $A$.

We have used Theorem 2.7 to prove that a linear transformation is Lipschitz
in Theorem 2.34. It boils down to the fact that if $\mathbf{T}(\mathbf{x}) = A\mathbf{x}$, then $\|\mathbf{T}(\mathbf{x})\|^2 = \mathbf{x}^T(A^T A)\mathbf{x}$,
and $A^T A$ is a positive semi-definite symmetric matrix. In fact, we can
also use Theorem 2.7 to prove Theorem 3.38, using the fact that if $A$ is invertible,
then $A^T A$ is positive definite.

Let us prove a weaker version of Theorem 2.7 here, which is sufficient to
establish Theorem 3.38 and the theorem which says that a linear transformation is
Lipschitz.


Theorem 3.39 


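Let $A$ be an $n \times n$ symmetric matrix, and let $Q_A : \mathbb{R}^n \to \mathbb{R}$ be the quadratic
form $Q_A(\mathbf{x}) = \mathbf{x}^T A\mathbf{x}$ defined by $A$. There exist constants $a$ and $b$ such
that
\[ a\|\mathbf{x}\|^2 \le Q_A(\mathbf{x}) \le b\|\mathbf{x}\|^2 \quad \text{for all } \mathbf{x} \in \mathbb{R}^n, \]
and $Q_A(\mathbf{u}) = a\|\mathbf{u}\|^2$ and $Q_A(\mathbf{v}) = b\|\mathbf{v}\|^2$ for some nonzero $\mathbf{u}$ and $\mathbf{v}$ in $\mathbb{R}^n$. Therefore,

(i) if $A$ is positive semi-definite, $b \ge a \ge 0$;

(ii) if $A$ is positive definite, $b \ge a > 0$.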


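As in the proof of Theorem 3.38, consider the continuous function $Q_A : S^{n-1} \to \mathbb{R}$.
Since $S^{n-1}$ is compact, there exist $\mathbf{u}$ and $\mathbf{v}$ in $S^{n-1}$ such that
\[ Q_A(\mathbf{u}) \le Q_A(\mathbf{w}) \le Q_A(\mathbf{v}) \quad \text{for all } \mathbf{w} \in S^{n-1}. \]
Let $a = Q_A(\mathbf{u})$ and $b = Q_A(\mathbf{v})$. If $\mathbf{x} = \mathbf{0}$, $a\|\mathbf{x}\|^2 \le Q_A(\mathbf{x}) \le b\|\mathbf{x}\|^2$ holds
trivially. Now if $\mathbf{x}$ is in $\mathbb{R}^n$ and $\mathbf{x} \ne \mathbf{0}$, let $\mathbf{w} = \alpha\mathbf{x}$, where $\alpha = 1/\|\mathbf{x}\|$.
Then $\mathbf{w}$ is on $S^{n-1}$. Notice that
\[ Q_A(\mathbf{w}) = \alpha^2 Q_A(\mathbf{x}) = \frac{Q_A(\mathbf{x})}{\|\mathbf{x}\|^2}. \]
Hence,
\[ a \le \frac{Q_A(\mathbf{x})}{\|\mathbf{x}\|^2} \le b. \]
This proves that
\[ a\|\mathbf{x}\|^2 \le Q_A(\mathbf{x}) \le b\|\mathbf{x}\|^2. \]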
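
The bounds can be illustrated for a small symmetric matrix by sampling $Q_A$ on the unit circle, mirroring the proof. This is only a sketch: the matrix is chosen for illustration, and the sampled minimum and maximum approximate $a$ and $b$ rather than compute them exactly. For comparison, the code also evaluates the two eigenvalues of $A$, which Theorem 2.7 identifies as the sharp constants.

```python
import math

A = [[2.0, 1.0],
     [1.0, 3.0]]  # illustrative symmetric matrix

def Q(x):
    """Quadratic form Q_A(x) = x^T A x for the 2 x 2 symmetric matrix A above."""
    return A[0][0] * x[0] ** 2 + 2 * A[0][1] * x[0] * x[1] + A[1][1] * x[1] ** 2

# Sample Q_A on the unit circle S^1.
N = 100_000
values = [Q((math.cos(2 * math.pi * k / N), math.sin(2 * math.pi * k / N)))
          for k in range(N)]
a_est, b_est = min(values), max(values)

# Eigenvalues of a symmetric 2 x 2 matrix from its trace and determinant.
trace, det = A[0][0] + A[1][1], A[0][0] * A[1][1] - A[0][1] ** 2
root = math.sqrt(trace * trace - 4 * det)
print(a_est, (trace - root) / 2)
print(b_est, (trace + root) / 2)
```

Here both sampled extremes are positive, matching case (ii) for a positive definite matrix.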


3.4.5 Lebesgue Number Lemma 


Now let us prove the following important theorem. 


Theorem 3.40 Lebesgue Number Lemma 


Let $A$ be a subset of $\mathbb{R}^n$, and let $\mathscr{A} = \{U_\alpha \mid \alpha \in J\}$ be an open covering
of $A$. If $A$ is compact, there exists a positive number $\delta$ such that if $S$ is a
subset of $A$ and $\operatorname{diam} S < \delta$, then $S$ is contained in one of the elements of
$\mathscr{A}$. Such a positive number $\delta$ is called a Lebesgue number of the covering
$\mathscr{A}$.


We give two proofs of this theorem. 




First Proof of the Lebesgue Number Lemma 


We use proof by contradiction. Assume that there does not exist a positive
number $\delta$ such that any subset $S$ of $A$ that has diameter less than $\delta$ lies
inside an open set in $\mathscr{A}$. Then for any $k \in \mathbb{Z}^+$, there is a subset $S_k$ of $A$
whose diameter is less than $1/k$, but $S_k$ is not contained in any element of
$\mathscr{A}$.

For each $k \in \mathbb{Z}^+$, the set $S_k$ cannot be empty. Let $\mathbf{x}_k$ be any point in $S_k$.
Then $\{\mathbf{x}_k\}$ is a sequence of points in $A$. Since $A$ is sequentially compact,
there is a subsequence $\{\mathbf{x}_{k_m}\}$ that converges to a point $\mathbf{x}_0$ in $A$. Since $\mathscr{A}$
is an open covering of $A$, there exists $\beta \in J$ such that $\mathbf{x}_0 \in U_\beta$. Since $U_\beta$
is open, there exists $r > 0$ such that $B(\mathbf{x}_0, r) \subset U_\beta$. Since the sequence
$\{\mathbf{x}_{k_m}\}$ converges to $\mathbf{x}_0$, there is a positive integer $M$ such that for all $m \ge M$,
$\mathbf{x}_{k_m} \in B(\mathbf{x}_0, r/2)$. There exists an integer $j \ge M$ such that $1/k_j < r/2$. If
$\mathbf{x} \in S_{k_j}$, then
\[ \|\mathbf{x} - \mathbf{x}_{k_j}\| \le \operatorname{diam} S_{k_j} < \frac{1}{k_j} < \frac{r}{2}. \]
Since $\mathbf{x}_{k_j} \in B(\mathbf{x}_0, r/2)$, $\|\mathbf{x}_{k_j} - \mathbf{x}_0\| < r/2$. Therefore, $\|\mathbf{x} - \mathbf{x}_0\| < r$. This
proves that $\mathbf{x} \in B(\mathbf{x}_0, r) \subset U_\beta$. Thus, we have shown that $S_{k_j} \subset U_\beta$. But
this contradicts the fact that $S_{k_j}$ does not lie in any element of $\mathscr{A}$.


Second Proof of the Lebesgue Number Lemma 


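Since $A$ is compact, there are finitely many indices $\alpha_1, \ldots, \alpha_m$ in $J$ such
that
\[ A \subset \bigcup_{j=1}^{m} U_{\alpha_j}. \]
For $1 \le j \le m$, let $C_j = \mathbb{R}^n \setminus U_{\alpha_j}$. Then $C_j$ is a closed set and $\bigcap_{j=1}^{m} C_j$
is disjoint from $A$. By Proposition 3.33, the function $f_j : A \to \mathbb{R}$, $f_j(\mathbf{x}) = d(\mathbf{x}, C_j)$
is continuous. Define $f : A \to \mathbb{R}$ by
\[ f(\mathbf{x}) = \frac{1}{m}\sum_{j=1}^{m} f_j(\mathbf{x}). \]
Then $f$ is also a continuous function. Since $A$ is compact, there is a point
$\mathbf{a}_0$ in $A$ such that
\[ f(\mathbf{a}_0) \le f(\mathbf{a}) \quad \text{for all } \mathbf{a} \in A. \]
Notice that $f_j(\mathbf{a}_0) \ge 0$ for all $1 \le j \le m$. Since $\bigcap_{j=1}^{m} C_j$ is disjoint from
$A$, there is a $1 \le k \le m$ such that $\mathbf{a}_0 \notin C_k$. Proposition 3.32 says that
$f_k(\mathbf{a}_0) = d(\mathbf{a}_0, C_k) > 0$. Hence, $f(\mathbf{a}_0) > 0$. Let $\delta = f(\mathbf{a}_0)$. It is the
minimum value of the function $f : A \to \mathbb{R}$.

Now let $S$ be a nonempty subset of $A$ such that $\operatorname{diam} S < \delta$. Take a point
$\mathbf{x}_0$ in $S$. Let $1 \le l \le m$ be an integer such that
\[ f_l(\mathbf{x}_0) \ge f_j(\mathbf{x}_0) \quad \text{for all } 1 \le j \le m. \]
Then
\[ \delta \le f(\mathbf{x}_0) \le f_l(\mathbf{x}_0) = d(\mathbf{x}_0, C_l). \]
For any $\mathbf{u} \in C_l$,
\[ d(\mathbf{x}_0, \mathbf{u}) \ge d(\mathbf{x}_0, C_l) \ge \delta. \]
If $\mathbf{x} \in S$, then $d(\mathbf{x}, \mathbf{x}_0) \le \operatorname{diam} S < \delta$. This implies that $\mathbf{x}$ is not in $C_l$.
Hence, it must be in $U_{\alpha_l}$. This shows that $S$ is contained in $U_{\alpha_l}$, which is an
element of $\mathscr{A}$. This completes the proof of the theorem.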
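
The second proof is constructive enough to compute with. As an illustration (the set, the covering and the grid are assumptions made for this sketch), take $A = [0, 1]$ covered by the two open intervals $(-0.1, 0.6)$ and $(0.4, 1.1)$. The Python code evaluates $f(x) = \frac{1}{m}\sum_{j} d(x, C_j)$ on a grid, takes the minimum as an estimate of $\delta$, and then checks on the grid that short subintervals fall inside a single element of the covering.

```python
cover = [(-0.1, 0.6), (0.4, 1.1)]  # open intervals U_1, U_2 covering A = [0, 1]

def dist_to_complement(x, interval):
    """d(x, C) where C is the complement of the open interval (lo, hi) in R."""
    lo, hi = interval
    return min(x - lo, hi - x) if lo < x < hi else 0.0

def f(x):
    """Average of the distances from x to the complements C_j."""
    return sum(dist_to_complement(x, U) for U in cover) / len(cover)

N = 1000
grid = [k / N for k in range(N + 1)]   # sample points of A = [0, 1]
delta = min(f(x) for x in grid)        # grid estimate of a Lebesgue number
print(delta)

def contained_in_some_U(s, t):
    """True if the closed interval [s, t] lies inside one interval of the cover."""
    return any(lo < s and t < hi for lo, hi in cover)

# Subintervals of [0, 1] starting on the grid and shorter than delta lie in a single U_j.
ok = all(contained_in_some_U(s, min(s + 0.9 * delta, 1.0)) for s in grid)
print(ok)
```

Any smaller positive number is also a Lebesgue number, so the value produced this way need not be the largest possible one.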


The Lebesgue number lemma can be used to give an alternative proof of
Theorem 3.37, which says that a continuous function defined on a compact subset
of $\mathbb{R}^n$ is uniformly continuous.




Alternative Proof of Theorem 3.37 
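Fix $\varepsilon > 0$. We want to show that there exists $\delta > 0$ such that if $\mathbf{u}$ and $\mathbf{v}$
are in $D$ and $\|\mathbf{u} - \mathbf{v}\| < \delta$, then $\|\mathbf{F}(\mathbf{u}) - \mathbf{F}(\mathbf{v})\| < \varepsilon$.
We will construct an open covering of $D$ indexed by $J = D$. Since $\mathbf{F} : D \to \mathbb{R}^m$
is continuous, for each $\mathbf{x} \in D$, there is a positive number $\delta_{\mathbf{x}}$
(depending on $\mathbf{x}$), such that if $\mathbf{u}$ is in $D$ and $\|\mathbf{u} - \mathbf{x}\| < \delta_{\mathbf{x}}$, then
$\|\mathbf{F}(\mathbf{u}) - \mathbf{F}(\mathbf{x})\| < \varepsilon/2$. Let $U_{\mathbf{x}} = B(\mathbf{x}, \delta_{\mathbf{x}})$. Then $U_{\mathbf{x}}$ is an open set. If $\mathbf{u}$ and
$\mathbf{v}$ are points of $D$ that lie in $U_{\mathbf{x}}$, then $\|\mathbf{F}(\mathbf{u}) - \mathbf{F}(\mathbf{x})\| < \varepsilon/2$ and $\|\mathbf{F}(\mathbf{v}) - \mathbf{F}(\mathbf{x})\| < \varepsilon/2$. Thus,
$\|\mathbf{F}(\mathbf{u}) - \mathbf{F}(\mathbf{v})\| < \varepsilon$.

Now $\mathscr{A} = \{U_{\mathbf{x}} \mid \mathbf{x} \in D\}$ is an open covering of $D$. Since $D$ is compact,
the Lebesgue number lemma implies that there exists a number $\delta > 0$ such
that if $S$ is a subset of $D$ that has diameter less than $\delta$, then $S$ is contained
in $U_{\mathbf{x}}$ for some $\mathbf{x} \in D$. We claim that this is the $\delta$ that we need.
If $\mathbf{u}$ and $\mathbf{v}$ are two points in $D$ and $\|\mathbf{u} - \mathbf{v}\| < \delta$, then $S = \{\mathbf{u}, \mathbf{v}\}$ is a set
with diameter less than $\delta$. Hence, there is an $\mathbf{x} \in D$ such that $S \subset U_{\mathbf{x}}$. This
implies that $\mathbf{u}$ and $\mathbf{v}$ are in $U_{\mathbf{x}}$. Hence, $\|\mathbf{F}(\mathbf{u}) - \mathbf{F}(\mathbf{v})\| < \varepsilon$. This completes
the proof.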




Exercises 3.4 
Question 1 
Let D = {(x,y)|2 < x? +4y? < 10}, and let F : D > R? be the function 


defined as i , 
x Yy wu —Y 
F = ; 
(x,y) (sot 4) 
Show that the function F :D — R® is bounded. 


Question 2 


Let D = {(z,y, z)|1 < 2? + 4y? < 10,0 < z < 5}, andlet f : D > Rbe 
the function defined as 


2 —¥? 
g2 + y2 + 22 


f(x,y, 2) = 


Show that the function f : © — R has a maximum value and a minimum 
value. 


Question 3 


Let $A = \{(x, y) \mid x^2 + 4y^2 \le 16\}$ and $B = \{(x, y) \mid x + y \ge 10\}$. Show
that the distance between the sets $A$ and $B$ is positive.


Question 4 


Let D = {(z,y, z)|a? + y? + 2? < 20} and let f :D — R be the function 
defined as 


x24+4z2 


f(x,y, 2) =e 


Show that f : © — R is uniformly continuous. 




Question 5 


Let D = (—1,2) x (—6,0) and let f : © — R be the function defined as 


HGe a) = \/ Pane 7 EinG 4E 22), 


Show that f : © — R is uniformly continuous. 




Chapter 4 
Differentiating Functions of Several Variables 


In this chapter, we study differential calculus of functions of several variables. 


4.1 Partial Derivatives 


When $f : (a, b) \to \mathbb{R}$ is a function defined on an open interval $(a, b)$, the derivative
of the function at a point $x_0$ in $(a, b)$ is defined as
\[ f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}, \]
provided that the limit exists. The derivative gives the instantaneous rate of change
of the function at the point $x_0$. Geometrically, it is the slope of the tangent line to
the graph of the function $f : (a, b) \to \mathbb{R}$ at the point $(x_0, f(x_0))$.


Figure 4.1: Derivative as slope of tangent line.


Now consider a function $f : O \to \mathbb{R}$ that is defined on an open subset $O$ of
$\mathbb{R}^n$, where $n \ge 2$. What is the natural way to extend the concept of derivatives to
this function?

From the perspective of rate of change, we need to consider the change of $f$
in various different directions. This leads us to consider directional derivatives.
Another perspective is to regard existence of derivatives as differentiability and
first-order approximation. Later we will see that all these are closely related.

First let us consider the rates of change of the function $f : O \to \mathbb{R}$ at a
point $\mathbf{x}_0$ in $O$ along the directions of the coordinate axes. These are called partial
derivatives.


Definition 4.1 Partial Derivatives 


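Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $f : O \to \mathbb{R}$
be a function defined on $O$. For $1 \le i \le n$, we say that the function
$f : O \to \mathbb{R}$ has a partial derivative with respect to its $i$-th component at the
point $\mathbf{x}_0$ if the limit
\[ \lim_{h \to 0} \frac{f(\mathbf{x}_0 + h\mathbf{e}_i) - f(\mathbf{x}_0)}{h} \]
exists. In this case, we denote the limit by $\frac{\partial f}{\partial x_i}(\mathbf{x}_0)$, and call it the partial
derivative of $f : O \to \mathbb{R}$ with respect to $x_i$ at $\mathbf{x}_0$.

We say that the function $f : O \to \mathbb{R}$ has partial derivatives at $\mathbf{x}_0$ if $\frac{\partial f}{\partial x_i}(\mathbf{x}_0)$
exists for all $1 \le i \le n$.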


Remark 4.1 


When we consider partial derivatives of a function, we always assume that
the domain of the function is an open set $O$, so that each point $\mathbf{x}_0$ in the
domain is an interior point of $O$, and a limit point of $O \setminus \{\mathbf{x}_0\}$. By definition
of open sets, there exists $r > 0$ such that $B(\mathbf{x}_0, r)$ is contained in $O$. This
allows us to compare the function values of $f$ in a neighbourhood of $\mathbf{x}_0$
from various different directions.

By definition, $\frac{\partial f}{\partial x_i}(\mathbf{x}_0)$ measures the rate of change of $f$ at $\mathbf{x}_0$ in the
direction of $\mathbf{e}_i$. It can also be interpreted as the slope of a curve at the
point $(\mathbf{x}_0, f(\mathbf{x}_0))$ on the surface $x_{n+1} = f(\mathbf{x})$, as shown in Figure 4.2.




Notations for Partial Derivatives 


An alternative notation for $\frac{\partial f}{\partial x_i}(\mathbf{x}_0)$ is $f_{x_i}(\mathbf{x}_0)$.


Figure 4.2: Partial derivative. 


Remark 4.2 Partial Derivatives 


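Let $\mathbf{x}_0 = (a_1, a_2, \ldots, a_n)$ and define the function $g : (-r, r) \to \mathbb{R}$ by
\[ g(h) = f(\mathbf{x}_0 + h\mathbf{e}_i) = f(a_1, \ldots, a_{i-1}, a_i + h, a_{i+1}, \ldots, a_n). \]
Then
\[ \lim_{h \to 0} \frac{f(\mathbf{x}_0 + h\mathbf{e}_i) - f(\mathbf{x}_0)}{h} = \lim_{h \to 0} \frac{g(h) - g(0)}{h} = g'(0). \]
Thus, $f_{x_i}(\mathbf{x}_0)$ exists if and only if $g(h)$ is differentiable at $h = 0$. Moreover,
to find $f_{x_i}(\mathbf{x}_0)$, we regard the variables $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$ as
constants, and differentiate with respect to $x_i$. Hence, the derivative rules
such as the sum rule, product rule and quotient rule still work for partial
derivatives, as long as one is clear about which variable is being differentiated and
which variables are regarded as constants.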
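
Remark 4.2 translates directly into a numerical recipe: freeze every coordinate except the $i$-th and differentiate the resulting single-variable function $g$ at $h = 0$. The following Python sketch approximates $g'(0)$ by a central difference. The helper name `partial`, the step size and the test function are illustrative choices, the index `i` is 0-based, and the result is only an approximation of the limit.

```python
import math

def partial(f, x0, i, h=1e-6):
    """Approximate f_{x_i}(x0) via g(t) = f(x0 + t e_i), using (g(h) - g(-h)) / (2h)."""
    def g(t):
        shifted = list(x0)
        shifted[i] += t          # move only along the i-th coordinate axis
        return f(shifted)
    return (g(h) - g(-h)) / (2 * h)

def f(p):
    x, y = p
    return x * math.exp(2 * x + 3 * y)   # sample smooth function of two variables

# Treating y as a constant gives f_x = (1 + 2x) e^{2x+3y};
# treating x as a constant gives f_y = 3x e^{2x+3y}.
point = (0.5, -0.2)
print(partial(f, point, 0), (1 + 2 * 0.5) * math.exp(2 * 0.5 + 3 * -0.2))
print(partial(f, point, 1), 3 * 0.5 * math.exp(2 * 0.5 + 3 * -0.2))
```

A finite difference can return a number even where the limit fails to exist, so it complements the definition rather than replacing it.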




Example 4.1 


R be the function defined as f(x,y) = x*y. Find f,(1, 2) 


Solution 


Therefore, 


Example 4.2 


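Let $f : \mathbb{R}^2 \to \mathbb{R}$ be the function defined as $f(x, y) = |x + y|$. Determine
whether $f_x(0, 0)$ exists.


Solution
By definition, $f_x(0, 0)$ is given by the limit
\[ \lim_{h \to 0} \frac{f(h, 0) - f(0, 0)}{h} \]
if it exists. Since
\[ \frac{f(h, 0) - f(0, 0)}{h} = \frac{|h|}{h} = \begin{cases} 1, & \text{if } h > 0, \\ -1, & \text{if } h < 0, \end{cases} \]
the limit
\[ \lim_{h \to 0} \frac{f(h, 0) - f(0, 0)}{h} \]
does not exist. Hence, $f_x(0, 0)$ does not exist.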
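
The failure of the limit is visible numerically. The short Python sketch below (an illustration, with arbitrary step sizes) evaluates the difference quotient of $f(x, y) = |x + y|$ at $(0, 0)$ from the right and from the left.

```python
def f(x, y):
    return abs(x + y)

def quotient(h):
    """Difference quotient (f(h, 0) - f(0, 0)) / h used to define f_x(0, 0)."""
    return (f(h, 0.0) - f(0.0, 0.0)) / h

for h in (1e-1, 1e-3, 1e-6):
    print(h, quotient(h), quotient(-h))   # right-hand and left-hand quotients
```

The right-hand quotients and the left-hand quotients settle on different values, so the two-sided limit cannot exist.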




Definition 4.2 


Let $O$ be an open subset of $\mathbb{R}^n$, and let $f : O \to \mathbb{R}$ be a function defined
on $O$. If the function $f : O \to \mathbb{R}$ has partial derivative with respect to $x_i$ at
every point of $O$, this defines the function $f_{x_i} : O \to \mathbb{R}$. In this case, we
say that the partial derivative of $f$ with respect to $x_i$ exists.
If $f_{x_i} : O \to \mathbb{R}$ exists for all $1 \le i \le n$, we say that the function $f : O \to \mathbb{R}$
has partial derivatives.


Example 4.3 


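Find the partial derivatives of the function $f : \mathbb{R}^3 \to \mathbb{R}$ defined as
\[ f(x, y, z) = \sin(xy + z) + \frac{3x}{y^2 + z^2 + 1}. \]


Solution
\[ \frac{\partial f}{\partial x}(x, y, z) = y\cos(xy + z) + \frac{3}{y^2 + z^2 + 1}, \]
\[ \frac{\partial f}{\partial y}(x, y, z) = x\cos(xy + z) - \frac{6xy}{(y^2 + z^2 + 1)^2}, \]
\[ \frac{\partial f}{\partial z}(x, y, z) = \cos(xy + z) - \frac{6xz}{(y^2 + z^2 + 1)^2}. \]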
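
These formulas can be cross-checked against central differences. The Python sketch below takes $f(x, y, z) = \sin(xy + z) + 3x/(y^2 + z^2 + 1)$ and compares the closed-form partial derivatives with numerical ones at an arbitrarily chosen point; the function names and the step size are illustrative.

```python
import math

def f(x, y, z):
    return math.sin(x * y + z) + 3 * x / (y * y + z * z + 1)

def grad_formula(x, y, z):
    """Closed-form partial derivatives (f_x, f_y, f_z)."""
    d = y * y + z * z + 1
    c = math.cos(x * y + z)
    return (y * c + 3 / d, x * c - 6 * x * y / d ** 2, c - 6 * x * z / d ** 2)

def grad_numeric(x, y, z, h=1e-6):
    """Central-difference approximation of the three partial derivatives."""
    return ((f(x + h, y, z) - f(x - h, y, z)) / (2 * h),
            (f(x, y + h, z) - f(x, y - h, z)) / (2 * h),
            (f(x, y, z + h) - f(x, y, z - h)) / (2 * h))

point = (0.7, -1.3, 0.4)   # arbitrary test point
print(grad_formula(*point))
print(grad_numeric(*point))
```

Agreement to several decimal places is a quick guard against sign or chain-rule slips in hand computation.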


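For a function defined on an open subset of $\mathbb{R}^n$, there are $n$ partial derivatives
with respect to the $n$ directions defined by the coordinate axes. These define a
vector in $\mathbb{R}^n$.


Definition 4.3 Gradient


Let $O$ be an open subset of $\mathbb{R}^n$, and let $\mathbf{x}_0$ be a point in $O$. If the function
$f : O \to \mathbb{R}$ has partial derivatives at $\mathbf{x}_0$, we define the gradient of the
function $f$ at $\mathbf{x}_0$ as the vector in $\mathbb{R}^n$ given by
\[ \nabla f(\mathbf{x}_0) = \left(\frac{\partial f}{\partial x_1}(\mathbf{x}_0), \frac{\partial f}{\partial x_2}(\mathbf{x}_0), \ldots, \frac{\partial f}{\partial x_n}(\mathbf{x}_0)\right). \]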


Let us revisit Example 4.3. 


Example 4.4


The gradient of the function $f : \mathbb{R}^3 \to \mathbb{R}$ defined as
\[ f(x, y, z) = \sin(xy + z) + \frac{3x}{y^2 + z^2 + 1} \]
in Example 4.3 is the function $\nabla f : \mathbb{R}^3 \to \mathbb{R}^3$,
\[ \nabla f(x, y, z) = \begin{pmatrix} y\cos(xy + z) + \dfrac{3}{y^2 + z^2 + 1} \\[2mm] x\cos(xy + z) - \dfrac{6xy}{(y^2 + z^2 + 1)^2} \\[2mm] \cos(xy + z) - \dfrac{6xz}{(y^2 + z^2 + 1)^2} \end{pmatrix}. \]
In particular,
\[ \nabla f(1, -1, 1) = \left(0, \frac{5}{3}, \frac{1}{3}\right). \]


It is straightforward to extend the definition of partial derivative to a function
$\mathbf{F} : O \to \mathbb{R}^m$ whose codomain is $\mathbb{R}^m$ with $m \ge 2$.


Definition 4.4 


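Let $O$ be an open subset of $\mathbb{R}^n$, and let $\mathbf{F} : O \to \mathbb{R}^m$ be a function defined
on $O$. Given $\mathbf{x}_0$ in $O$ and $1 \le i \le n$, we say that $\mathbf{F} : O \to \mathbb{R}^m$ has partial
derivative with respect to $x_i$ at the point $\mathbf{x}_0$ if the limit
\[ \frac{\partial \mathbf{F}}{\partial x_i}(\mathbf{x}_0) = \lim_{h \to 0} \frac{\mathbf{F}(\mathbf{x}_0 + h\mathbf{e}_i) - \mathbf{F}(\mathbf{x}_0)}{h} \]
exists. We say that $\mathbf{F} : O \to \mathbb{R}^m$ has partial derivative at the point $\mathbf{x}_0$ if
$\frac{\partial \mathbf{F}}{\partial x_i}(\mathbf{x}_0)$ exists for each $1 \le i \le n$. We say that $\mathbf{F} : O \to \mathbb{R}^m$ has partial
derivative if it has partial derivative at each point of $O$.


Since the limit of a function $\mathbf{G} : (-r, r) \to \mathbb{R}^m$ when $h \to 0$ exists if and
only if the limit of each component function $G_j : (-r, r) \to \mathbb{R}$, $1 \le j \le m$, when
$h \to 0$ exists, we have the following.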


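Proposition 4.1


Let $O$ be an open subset of $\mathbb{R}^n$, and let $\mathbf{F} : O \to \mathbb{R}^m$ be a function defined
on $O$. Given $\mathbf{x}_0$ in $O$ and $1 \le i \le n$, $\mathbf{F} : O \to \mathbb{R}^m$ has partial derivative
with respect to $x_i$ at the point $\mathbf{x}_0$ if and only if each component function
$F_j : O \to \mathbb{R}$, $1 \le j \le m$, has partial derivative with respect to $x_i$ at the
point $\mathbf{x}_0$. In this case, we have
\[ \frac{\partial \mathbf{F}}{\partial x_i}(\mathbf{x}_0) = \left(\frac{\partial F_1}{\partial x_i}(\mathbf{x}_0), \ldots, \frac{\partial F_m}{\partial x_i}(\mathbf{x}_0)\right). \]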


To capture all the partial derivatives, we define a derivative matrix. 


Definition 4.5 The Derivative Matrix 


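Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $\mathbf{F} : O \to \mathbb{R}^m$
be a function defined on $O$. If $\mathbf{F} : O \to \mathbb{R}^m$ has partial derivative at the
point $\mathbf{x}_0$, the derivative matrix of $\mathbf{F} : O \to \mathbb{R}^m$ at $\mathbf{x}_0$ is the $m \times n$ matrix
\[ \mathbf{DF}(\mathbf{x}_0) = \begin{bmatrix} \dfrac{\partial F_1}{\partial x_1}(\mathbf{x}_0) & \cdots & \dfrac{\partial F_1}{\partial x_n}(\mathbf{x}_0) \\ \vdots & \ddots & \vdots \\ \dfrac{\partial F_m}{\partial x_1}(\mathbf{x}_0) & \cdots & \dfrac{\partial F_m}{\partial x_n}(\mathbf{x}_0) \end{bmatrix}. \]

When $m = 1$, the derivative matrix is just the gradient of the function as a row
matrix.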


Example 4.5 


Let $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^2$ be the function defined as
\[ \mathbf{F}(x, y, z) = \left(xy^2z^3,\ x + 3y - 7z\right). \]
Find the derivative matrix of $\mathbf{F}$ at the point $(1, -1, 2)$.


Solution
\[ \mathbf{DF}(x, y, z) = \begin{bmatrix} y^2z^3 & 2xyz^3 & 3xy^2z^2 \\ 1 & 3 & -7 \end{bmatrix}. \]
Thus, the derivative matrix of $\mathbf{F}$ at the point $(1, -1, 2)$ is
\[ \mathbf{DF}(1, -1, 2) = \begin{bmatrix} 8 & -16 & 12 \\ 1 & 3 & -7 \end{bmatrix}. \]

Since the partial derivatives of a function are defined componentwise, we can
focus on functions $f : O \to \mathbb{R}$ whose codomain is $\mathbb{R}$. One might wonder why we
have not mentioned the word "differentiable" so far. For single variable functions,
we have seen in volume I that if a function is differentiable at a point, then it
is continuous at that point. For multivariable functions, the existence of partial
derivatives is not enough to guarantee continuity, as is shown in the next example.


Example 4.6 


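Let $f : \mathbb{R}^2 \to \mathbb{R}$ be the function defined as
\[ f(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2}, & \text{if } (x, y) \ne (0, 0), \\ 0, & \text{if } (x, y) = (0, 0). \end{cases} \]
Show that $f$ is not continuous at $(0, 0)$, but it has partial derivatives at $(0, 0)$.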


Solution 


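Consider the sequence $\{\mathbf{u}_k\}$ with
\[ \mathbf{u}_k = \left(\frac{1}{k}, \frac{1}{k}\right). \]
It is a sequence in $\mathbb{R}^2$ that converges to $(0, 0)$. Since
\[ f(\mathbf{u}_k) = \frac{1}{2} \quad \text{for all } k \in \mathbb{Z}^+, \]
the sequence $\{f(\mathbf{u}_k)\}$ converges to $1/2$. But $f(0, 0) = 0 \ne 1/2$. Since
there is a sequence $\{\mathbf{u}_k\}$ that converges to $(0, 0)$, but the sequence $\{f(\mathbf{u}_k)\}$
does not converge to $f(0, 0)$, $f$ is not continuous at $(0, 0)$.

Figure 4.3: The function $f(x, y)$ defined in Example 4.6.

To find partial derivatives at $(0, 0)$, we use definitions.
\[ f_x(0, 0) = \lim_{h \to 0} \frac{f(h, 0) - f(0, 0)}{h} = \lim_{h \to 0} \frac{0 - 0}{h} = 0, \]
\[ f_y(0, 0) = \lim_{h \to 0} \frac{f(0, h) - f(0, 0)}{h} = \lim_{h \to 0} \frac{0 - 0}{h} = 0. \]
These show that $f$ has partial derivatives at $(0, 0)$, and $f_x(0, 0) = f_y(0, 0) = 0$.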


For the function defined in Example 4.6, it has partial derivatives at all points.
In fact, when $(x, y) \ne (0, 0)$, we can apply derivative rules directly and find that
\[ \frac{\partial f}{\partial x}(x, y) = \frac{(x^2 + y^2)y - 2x^2y}{(x^2 + y^2)^2} = \frac{y(y^2 - x^2)}{(x^2 + y^2)^2}. \]
Similarly,
\[ \frac{\partial f}{\partial y}(x, y) = \frac{x(x^2 - y^2)}{(x^2 + y^2)^2}. \]
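
The behaviour of this function near the origin is easy to observe numerically. The Python sketch below (an illustration only) takes $f(x, y) = xy/(x^2 + y^2)$ with $f(0, 0) = 0$, evaluates it along the diagonal points $(1/k, 1/k)$, and then forms the difference quotients at the origin along the two coordinate axes.

```python
def f(x, y):
    if (x, y) == (0.0, 0.0):
        return 0.0
    return x * y / (x * x + y * y)

# Along the diagonal u_k = (1/k, 1/k) the values stay at 1/2, although u_k -> (0, 0).
for k in (1, 10, 1000):
    print(k, f(1 / k, 1 / k))

# Along the coordinate axes f vanishes, so both difference quotients at (0, 0) are 0.
for h in (1e-1, 1e-4, 1e-8):
    print(h, (f(h, 0.0) - f(0.0, 0.0)) / h, (f(0.0, h) - f(0.0, 0.0)) / h)
```

Partial derivatives only probe the function along the axes, which is why they miss the jump seen along the diagonal.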


Let us highlight again our conclusion. 




Partial Derivative vs Continuity 


The existence of partial derivatives does not imply continuity. 


This prompts us to find a better definition of differentiability, which can imply
continuity. This will be considered in a later section.


When the function $f : O \to \mathbb{R}$ has partial derivative with respect to $x_i$, we
obtain the function $f_{x_i} : O \to \mathbb{R}$. Then we can discuss whether the function $f_{x_i}$
has partial derivative at a point in $O$.


Definition 4.6 Second Order Partial Derivatives 


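Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $f : O \to \mathbb{R}$
be a function defined on $O$. Given that $1 \le i \le n$, $1 \le j \le n$, we say
that the second order partial derivative $\frac{\partial^2 f}{\partial x_j \partial x_i}$ exists at $\mathbf{x}_0$ provided that
there exists an open ball $B(\mathbf{x}_0, r)$ that is contained in $O$ such that $\frac{\partial f}{\partial x_i} : B(\mathbf{x}_0, r) \to \mathbb{R}$
exists, and it has partial derivative with respect to $x_j$ at
the point $\mathbf{x}_0$. In this case, we define the second order partial derivative
$\frac{\partial^2 f}{\partial x_j \partial x_i}(\mathbf{x}_0)$ of $f$ at $\mathbf{x}_0$ as
\[ \frac{\partial^2 f}{\partial x_j \partial x_i}(\mathbf{x}_0) = \frac{\partial f_{x_i}}{\partial x_j}(\mathbf{x}_0) = \lim_{h \to 0} \frac{f_{x_i}(\mathbf{x}_0 + h\mathbf{e}_j) - f_{x_i}(\mathbf{x}_0)}{h}. \]
We say that the function $f : O \to \mathbb{R}$ has second order partial derivatives at
$\mathbf{x}_0$ provided that $\frac{\partial^2 f}{\partial x_j \partial x_i}(\mathbf{x}_0)$ exists for all $1 \le i \le n$, $1 \le j \le n$.


In the same way, one can also define second order partial derivatives for a
function $\mathbf{F} : O \to \mathbb{R}^m$ with codomain $\mathbb{R}^m$ when $m \ge 2$.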


Remark 4.3

In the definition of the second order partial derivative $\frac{\partial^2 f}{\partial x_j \partial x_i}(\mathbf{x}_0)$, instead
of assuming $f_{x_i}(\mathbf{x})$ exists for all $\mathbf{x}$ in a ball of radius $r$ centered at $\mathbf{x}_0$, it is
sufficient to assume that there exists $r > 0$ such that $f_{x_i}(\mathbf{x}_0 + h\mathbf{e}_j)$ exists
for all $|h| < r$.


Definition 4.7 
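Given $1 \le i \le n$, $1 \le j \le n$, we say that the function $f : O \to \mathbb{R}$ has the
second order partial derivative $\frac{\partial^2 f}{\partial x_j \partial x_i}$ provided that $\frac{\partial^2 f}{\partial x_j \partial x_i}(\mathbf{x}_0)$ exists for
all $\mathbf{x}_0$ in $O$.

We say that the function $f : O \to \mathbb{R}$ has second order partial derivatives
provided that $\frac{\partial^2 f}{\partial x_j \partial x_i}$ exists for all $1 \le i \le n$, $1 \le j \le n$.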


OL 00; 


Notations for Second Order Partial Derivatives 
Alternative notations for second order partial derivatives are
\[ \frac{\partial^2 f}{\partial x_j \partial x_i} = (f_{x_i})_{x_j} = f_{x_i x_j}. \]
Notice that the orders of $x_i$ and $x_j$ are different in different notations.


Remark 4.4 


Given $1 \le i \le n$, $1 \le j \le n$, the function $f : O \to \mathbb{R}$ has the second
order partial derivative $\frac{\partial^2 f}{\partial x_j \partial x_i}$ provided that $f_{x_i} : O \to \mathbb{R}$ exists, and $f_{x_i}$
has partial derivative with respect to $x_j$.


Example 4.7 


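Find the second order partial derivatives of the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined
as
\[ f(x, y) = xe^{2x+3y}. \]


Solution
We find the first order partial derivatives first.
\[ \frac{\partial f}{\partial x}(x, y) = e^{2x+3y} + 2xe^{2x+3y} = (1 + 2x)e^{2x+3y}, \]
\[ \frac{\partial f}{\partial y}(x, y) = 3xe^{2x+3y}. \]
Then we compute the second order partial derivatives.
\[ \frac{\partial^2 f}{\partial x^2}(x, y) = 2e^{2x+3y} + 2(1 + 2x)e^{2x+3y} = (4 + 4x)e^{2x+3y}, \]
\[ \frac{\partial^2 f}{\partial y \partial x}(x, y) = 3(1 + 2x)e^{2x+3y} = (3 + 6x)e^{2x+3y}, \]
\[ \frac{\partial^2 f}{\partial x \partial y}(x, y) = 3e^{2x+3y} + 6xe^{2x+3y} = (3 + 6x)e^{2x+3y}, \]
\[ \frac{\partial^2 f}{\partial y^2}(x, y) = 9xe^{2x+3y}. \]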


Definition 4.8 The Hessian Matrix 


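Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$. If $f : O \to \mathbb{R}$ is a
function that has second order partial derivatives at $\mathbf{x}_0$, the Hessian matrix
of $f$ at $\mathbf{x}_0$ is the $n \times n$ matrix defined as
\[ H_f(\mathbf{x}_0) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2}(\mathbf{x}_0) & \dfrac{\partial^2 f}{\partial x_1 \partial x_2}(\mathbf{x}_0) & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n}(\mathbf{x}_0) \\ \dfrac{\partial^2 f}{\partial x_2 \partial x_1}(\mathbf{x}_0) & \dfrac{\partial^2 f}{\partial x_2^2}(\mathbf{x}_0) & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n}(\mathbf{x}_0) \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1}(\mathbf{x}_0) & \dfrac{\partial^2 f}{\partial x_n \partial x_2}(\mathbf{x}_0) & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}(\mathbf{x}_0) \end{bmatrix}. \]

We do not define Hessian matrix for a function $\mathbf{F} : O \to \mathbb{R}^m$ with codomain
$\mathbb{R}^m$ when $m \ge 2$.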


Example 4.8


For the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined as $f(x, y) = xe^{2x+3y}$ in Example 4.7,
\[ H_f(x, y) = \begin{bmatrix} (4 + 4x)e^{2x+3y} & (3 + 6x)e^{2x+3y} \\ (3 + 6x)e^{2x+3y} & 9xe^{2x+3y} \end{bmatrix}. \]


In Example 4.7, we notice that
\[ \frac{\partial^2 f}{\partial y \partial x}(x, y) = \frac{\partial^2 f}{\partial x \partial y}(x, y) \]
for all $(x, y) \in \mathbb{R}^2$. The following example shows that this is not always true.


Example 4.9 


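Consider the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined as
\[ f(x, y) = \begin{cases} \dfrac{xy(x^2 - y^2)}{x^2 + y^2}, & \text{if } (x, y) \ne (0, 0), \\ 0, & \text{if } (x, y) = (0, 0). \end{cases} \]
Find $f_{xy}(0, 0)$ and $f_{yx}(0, 0)$.


Figure 4.4: The function $f(x, y)$ defined in Example 4.9.


Solution
To compute $f_{xy}(0, 0)$, we need to compute $f_x(0, h)$ for all $h$ in a
neighbourhood of 0. To compute $f_{yx}(0, 0)$, we need to compute $f_y(h, 0)$
for all $h$ in a neighbourhood of 0. Notice that for any $h \in \mathbb{R}$, $f(0, h) = f(h, 0) = 0$.
By considering $h = 0$ and $h \ne 0$ separately, we find that
\[ f_x(0, h) = \lim_{t \to 0} \frac{f(t, h) - f(0, h)}{t} = \lim_{t \to 0} \frac{h(t^2 - h^2)}{t^2 + h^2} = -h, \]
\[ f_y(h, 0) = \lim_{t \to 0} \frac{f(h, t) - f(h, 0)}{t} = \lim_{t \to 0} \frac{h(h^2 - t^2)}{h^2 + t^2} = h. \]
It follows that
\[ f_{xy}(0, 0) = \lim_{h \to 0} \frac{f_x(0, h) - f_x(0, 0)}{h} = -1, \]
\[ f_{yx}(0, 0) = \lim_{h \to 0} \frac{f_y(h, 0) - f_y(0, 0)}{h} = 1. \]


Example 4.9 shows that there exists a function $f : \mathbb{R}^2 \to \mathbb{R}$ which has
second order partial derivatives at $(0, 0)$ but
\[ \frac{\partial^2 f}{\partial x \partial y}(0, 0) \ne \frac{\partial^2 f}{\partial y \partial x}(0, 0). \]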

Remark 4.5 


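If $O$ is an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, there exists $r > 0$
such that $B(\mathbf{x}_0, r) \subset O$. Given that $f : O \to \mathbb{R}$ is a function defined on $O$,
and $1 \le i < j \le n$, let $D$ be the ball with center at $(0, 0)$ and radius $r$ in
$\mathbb{R}^2$. Define the function $g : D \to \mathbb{R}$ by
\[ g(u, v) = f(\mathbf{x}_0 + u\mathbf{e}_i + v\mathbf{e}_j). \]
Then $\frac{\partial^2 f}{\partial x_j \partial x_i}(\mathbf{x}_0)$ exists if and only if $\frac{\partial^2 g}{\partial v \partial u}(0, 0)$ exists. In such case, we
have
\[ \frac{\partial^2 f}{\partial x_j \partial x_i}(\mathbf{x}_0) = \frac{\partial^2 g}{\partial v \partial u}(0, 0). \]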




The following gives a sufficient condition to interchange the order of taking 
partial derivatives. 


Theorem 4.2 Clairaut’s Theorem or Schwarz’s Theorem 


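Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $f : O \to \mathbb{R}$ be a function defined on $O$. Assume that $1 \leq i < j \leq n$, and the second order partial derivatives $\dfrac{\partial^2 f}{\partial x_j\partial x_i} : O \to \mathbb{R}$ and $\dfrac{\partial^2 f}{\partial x_i\partial x_j} : O \to \mathbb{R}$ exist. If the functions $\dfrac{\partial^2 f}{\partial x_j\partial x_i}$ and $\dfrac{\partial^2 f}{\partial x_i\partial x_j} : O \to \mathbb{R}$ are continuous at $\mathbf{x}_0$, then

$$\frac{\partial^2 f}{\partial x_j\partial x_i}(\mathbf{x}_0) = \frac{\partial^2 f}{\partial x_i\partial x_j}(\mathbf{x}_0).$$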


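Since $O$ is an open set that contains the point $\mathbf{x}_0$, there exists $r > 0$ such that $B(\mathbf{x}_0, r) \subset O$. Let

$$D = \left\{(u,v) \,|\, u^2 + v^2 < r^2\right\},$$

and define the function $g : D \to \mathbb{R}$ by

$$g(u,v) = f(\mathbf{x}_0 + u\mathbf{e}_i + v\mathbf{e}_j).$$

By Remark 4.5, $g$ has second order partial derivatives, and $\dfrac{\partial^2 g}{\partial v\partial u}$ and $\dfrac{\partial^2 g}{\partial u\partial v}$ are continuous at $(0,0)$. We need to show that

$$\frac{\partial^2 g}{\partial v\partial u}(0,0) = \frac{\partial^2 g}{\partial u\partial v}(0,0).$$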


Consider the function

$$G(u,v) = g(u,v) - g(u,0) - g(0,v) + g(0,0).$$

Notice that

$$G(u,v) = H_v(u) - H_v(0) = S_u(v) - S_u(0),$$

where

$$H_v(u) = g(u,v) - g(u,0), \qquad S_u(v) = g(u,v) - g(0,v).$$

For fixed $v$ with $|v| < r$, the function $H_v(u)$ is defined for those $u$ with $|u| < \sqrt{r^2 - v^2}$, such that $(u,v)$ is in $D$. It is differentiable with

$$H_v'(u) = \frac{\partial g}{\partial u}(u,v) - \frac{\partial g}{\partial u}(u,0).$$


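Hence, if $(u,v)$ is in $D$, the mean value theorem for single variable functions implies that there exists $c_{u,v} \in (0,1)$ such that

$$G(u,v) = H_v(u) - H_v(0) = uH_v'(c_{u,v}u) = u\left(\frac{\partial g}{\partial u}(c_{u,v}u, v) - \frac{\partial g}{\partial u}(c_{u,v}u, 0)\right).$$

Regarding this now as a function of $v$, the mean value theorem for single variable functions implies that there exists $d_{u,v} \in (0,1)$ such that

$$G(u,v) = uv\,\frac{\partial^2 g}{\partial v\partial u}(c_{u,v}u, d_{u,v}v). \tag{4.1}$$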


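Using the same reasoning, we find that for $(u,v) \in D$, there exists $\tilde{d}_{u,v} \in (0,1)$ such that

$$G(u,v) = vS_u'(\tilde{d}_{u,v}v) = v\left(\frac{\partial g}{\partial v}(u, \tilde{d}_{u,v}v) - \frac{\partial g}{\partial v}(0, \tilde{d}_{u,v}v)\right).$$

Regarding this as a function of $u$, the mean value theorem implies that there exists $\tilde{c}_{u,v} \in (0,1)$ such that

$$G(u,v) = uv\,\frac{\partial^2 g}{\partial u\partial v}(\tilde{c}_{u,v}u, \tilde{d}_{u,v}v). \tag{4.2}$$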


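Comparing (4.1) and (4.2), we find that whenever $uv \neq 0$,

$$\frac{\partial^2 g}{\partial v\partial u}(c_{u,v}u, d_{u,v}v) = \frac{\partial^2 g}{\partial u\partial v}(\tilde{c}_{u,v}u, \tilde{d}_{u,v}v).$$

When $(u,v) \to (0,0)$, $(c_{u,v}u, d_{u,v}v) \to (0,0)$ and $(\tilde{c}_{u,v}u, \tilde{d}_{u,v}v) \to (0,0)$. The continuities of $g_{uv}$ and $g_{vu}$ at $(0,0)$ then imply that

$$\frac{\partial^2 g}{\partial v\partial u}(0,0) = \frac{\partial^2 g}{\partial u\partial v}(0,0).$$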

This completes the proof. 


Example 4.10 


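Consider the function $f : \mathbb{R}^2 \to \mathbb{R}$ in Example 4.9 defined as

$$f(x,y) = \begin{cases} \dfrac{xy(x^2-y^2)}{x^2+y^2}, & \text{if } (x,y) \neq (0,0), \\ 0, & \text{if } (x,y) = (0,0). \end{cases}$$

When $(x,y) \neq (0,0)$, we find that

$$\frac{\partial f}{\partial x}(x,y) = \frac{y(x^4 + 4x^2y^2 - y^4)}{(x^2+y^2)^2}, \qquad \frac{\partial f}{\partial y}(x,y) = \frac{x(x^4 - 4x^2y^2 - y^4)}{(x^2+y^2)^2}.$$

It follows that

$$\frac{\partial^2 f}{\partial y\partial x}(x,y) = \frac{x^6 + 9x^4y^2 - 9x^2y^4 - y^6}{(x^2+y^2)^3} = \frac{\partial^2 f}{\partial x\partial y}(x,y).$$

Indeed, both $f_{xy}$ and $f_{yx}$ are continuous on $\mathbb{R}^2 \setminus \{(0,0)\}$.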


Corollary 4.3 


Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $f : O \to \mathbb{R}$ be a function defined on $O$. If all the second order partial derivatives of the function $f : O \to \mathbb{R}$ are continuous at $\mathbf{x}_0$, then the Hessian matrix $H_f(\mathbf{x}_0)$ of $f$ at $\mathbf{x}_0$ is a symmetric matrix.




Remark 4.6 

One can define partial derivatives of higher orders following the same 
rationale as we define the second order partial derivatives. Extension of 
Clairaut’s theorem to higher order partial derivatives is straightforward. 


The key point is the continuity of the partial derivatives involved. 




Exercises 4.1 


Question 1 


Let f : R® — R be the function defined as 


LZ 
Cais) oa a ae i” 


Find Vf (1,0, —1), the gradient of f at the point (1,0, —1). 


Question 2 


Let F : R? — R? be the function defined as 


Ea) — (eee. 3a oe dy”) ; 


Find DF(2, —1), the derivative matrix of F at the point (2, —1). 


Question 3 


Let f : R®? > R be the function defined as 
f(a, y, 2) = a? + Bayz + Qy?z?. 


Find H;(1, —1, 2), the Hessian matrix of f at the point (1, —1, 2). 


Question 4 
Let f : R? — R be the function defined as 


3 . 
‘noize “eno 


0, Me ie (OD): 
Show that f is not continuous at (0,0), but it has partial derivatives at (0,0). 


Question 5 


Let f : R? > R be the function defined as f(x,y) = |x? + y|. Determine 
whether f,,(1, —1) exists. 




Question 6 


Let f : R? — R be the function defined as 


2 


f(e,y) = arp few 409) 


0, TGs ONE 


Show that f is continuous, it has partial derivatives, but the partial 


derivatives are not continuous. 


Question 7 


Consider the function f : R? > R defined as 


$$f(x,y) = \begin{cases} \dfrac{xy(x^2+9y^2)}{4x^2+y^2}, & \text{if } (x,y) \neq (0,0), \\ 0, & \text{if } (x,y) = (0,0). \end{cases}$$


Find the Hessian matrix H (0,0) of f at (0,0). 




4.2 Differentiability and First Order Approximation 


Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $\mathbf{F} : O \to \mathbb{R}^m$ be a function defined on $O$. As we have seen in the previous section, even if $\mathbf{F}$ has partial derivatives at $\mathbf{x}_0$, it does not imply that $\mathbf{F}$ is continuous at $\mathbf{x}_0$. Heuristically, this is because the partial derivatives only consider the change of the function along the $n$ directions defined by the coordinate axes, while continuity of $\mathbf{F}$ requires us to consider the change of $\mathbf{F}$ along all directions.


4.2.1 Differentiability 


In this section, we will give a suitable definition of differentiability to ensure that we can capture the change of $\mathbf{F}$ in all directions. Let us first revisit an alternative perspective of differentiability for a single variable function $f : (a,b) \to \mathbb{R}$, which we have discussed in volume I. If $x_0$ is a point in $(a,b)$, then the function $f : (a,b) \to \mathbb{R}$ is differentiable at $x_0$ if and only if there is a number $c$ such that

$$\lim_{h\to 0}\frac{f(x_0+h) - f(x_0) - ch}{h} = 0. \tag{4.3}$$

In fact, if $f$ is differentiable at $x_0$, then this number $c$ has to be equal to $f'(x_0)$.

Now for a function $\mathbf{F} : O \to \mathbb{R}^m$ defined on an open subset $O$ of $\mathbb{R}^n$, to consider the differentiability of $\mathbf{F}$ at $\mathbf{x}_0 \in O$, we should compare $\mathbf{F}(\mathbf{x}_0)$ to $\mathbf{F}(\mathbf{x}_0 + \mathbf{h})$ for all $\mathbf{h}$ in a neighbourhood of $\mathbf{0}$. But then a reasonable substitute of the number $c$ should be a linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$, so that for each $\mathbf{h}$ in a neighbourhood of $\mathbf{0}$, it gives a vector $\mathbf{T}(\mathbf{h})$ in $\mathbb{R}^m$. As now $\mathbf{h}$ is a vector in $\mathbb{R}^n$, we cannot divide by $\mathbf{h}$ in (4.3). It should be replaced with $\|\mathbf{h}\|$, the norm of $\mathbf{h}$.


Definition 4.9 Differentiability 


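Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $\mathbf{F} : O \to \mathbb{R}^m$ be a function defined on $O$. The function $\mathbf{F} : O \to \mathbb{R}^m$ is differentiable at $\mathbf{x}_0$ provided that there exists a linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$ so that

$$\lim_{\mathbf{h}\to\mathbf{0}}\frac{\mathbf{F}(\mathbf{x}_0+\mathbf{h}) - \mathbf{F}(\mathbf{x}_0) - \mathbf{T}(\mathbf{h})}{\|\mathbf{h}\|} = \mathbf{0}.$$

The function $\mathbf{F} : O \to \mathbb{R}^m$ is differentiable if it is differentiable at each point of $O$.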




Remark 4.7 


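The differentiability of $\mathbf{F} : O \to \mathbb{R}^m$ at $\mathbf{x}_0$ amounts to the existence of a linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$ so that

$$\mathbf{F}(\mathbf{x}_0+\mathbf{h}) = \mathbf{F}(\mathbf{x}_0) + \mathbf{T}(\mathbf{h}) + \boldsymbol{\varepsilon}(\mathbf{h})\|\mathbf{h}\|,$$

where $\boldsymbol{\varepsilon}(\mathbf{h}) \to \mathbf{0}$ as $\mathbf{h} \to \mathbf{0}$.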


The following is obvious from the definition. 


Proposition 4.4 


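Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $\mathbf{F} : O \to \mathbb{R}^m$ be a function defined on $O$. The function $\mathbf{F} : O \to \mathbb{R}^m$ is differentiable at $\mathbf{x}_0$ if and only if each of its component functions $F_j : O \to \mathbb{R}$, $1 \leq j \leq m$, is differentiable at $\mathbf{x}_0$.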


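Let the components of the function

$$\boldsymbol{\varepsilon}(\mathbf{h}) = \frac{\mathbf{F}(\mathbf{x}_0+\mathbf{h}) - \mathbf{F}(\mathbf{x}_0) - \mathbf{T}(\mathbf{h})}{\|\mathbf{h}\|}$$

be $\varepsilon_1(\mathbf{h}), \varepsilon_2(\mathbf{h}), \ldots, \varepsilon_m(\mathbf{h})$. Then for $1 \leq j \leq m$,

$$\varepsilon_j(\mathbf{h}) = \frac{F_j(\mathbf{x}_0+\mathbf{h}) - F_j(\mathbf{x}_0) - T_j(\mathbf{h})}{\|\mathbf{h}\|}.$$

The assertion of the proposition follows from the fact that

$$\lim_{\mathbf{h}\to\mathbf{0}}\boldsymbol{\varepsilon}(\mathbf{h}) = \mathbf{0} \quad\text{if and only if}\quad \lim_{\mathbf{h}\to\mathbf{0}}\varepsilon_j(\mathbf{h}) = 0 \ \text{ for all } 1 \leq j \leq m,$$

while $\displaystyle\lim_{\mathbf{h}\to\mathbf{0}}\varepsilon_j(\mathbf{h}) = 0$ if and only if $F_j : O \to \mathbb{R}$ is differentiable at $\mathbf{x}_0$.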


Let us look at a simple example of differentiable functions. 




Example 4.11 


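Let $A$ be an $m \times n$ matrix, and let $\mathbf{b}$ be a point in $\mathbb{R}^m$. Define the function $\mathbf{F} : \mathbb{R}^n \to \mathbb{R}^m$ by

$$\mathbf{F}(\mathbf{x}) = A\mathbf{x} + \mathbf{b}.$$

Show that $\mathbf{F} : \mathbb{R}^n \to \mathbb{R}^m$ is differentiable.


Solution


Given $\mathbf{x}_0$ and $\mathbf{h}$ in $\mathbb{R}^n$, notice that

$$\mathbf{F}(\mathbf{x}_0+\mathbf{h}) = A(\mathbf{x}_0+\mathbf{h}) + \mathbf{b} = A\mathbf{x}_0 + \mathbf{b} + A\mathbf{h} = \mathbf{F}(\mathbf{x}_0) + A\mathbf{h}. \tag{4.4}$$

The map $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$ defined as $\mathbf{T}(\mathbf{h}) = A\mathbf{h}$ is a linear transformation. Eq. (4.4) says that

$$\mathbf{F}(\mathbf{x}_0+\mathbf{h}) - \mathbf{F}(\mathbf{x}_0) - \mathbf{T}(\mathbf{h}) = \mathbf{0}.$$

Thus,

$$\lim_{\mathbf{h}\to\mathbf{0}}\frac{\mathbf{F}(\mathbf{x}_0+\mathbf{h}) - \mathbf{F}(\mathbf{x}_0) - \mathbf{T}(\mathbf{h})}{\|\mathbf{h}\|} = \mathbf{0}.$$

Therefore, $\mathbf{F}$ is differentiable at $\mathbf{x}_0$. Since the point $\mathbf{x}_0$ is arbitrary, the function $\mathbf{F} : \mathbb{R}^n \to \mathbb{R}^m$ is differentiable.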


The next theorem says that differentiability implies continuity. 


Theorem 4.5 Differentiability Implies Continuity 


Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $\mathbf{F} : O \to \mathbb{R}^m$ be a function defined on $O$. If the function $\mathbf{F} : O \to \mathbb{R}^m$ is differentiable at $\mathbf{x}_0$, then it is continuous at $\mathbf{x}_0$.


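Since $\mathbf{F} : O \to \mathbb{R}^m$ is differentiable at $\mathbf{x}_0$, there exists a linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$ such that

$$\boldsymbol{\varepsilon}(\mathbf{h}) = \frac{\mathbf{F}(\mathbf{x}_0+\mathbf{h}) - \mathbf{F}(\mathbf{x}_0) - \mathbf{T}(\mathbf{h})}{\|\mathbf{h}\|} \to \mathbf{0} \quad\text{as } \mathbf{h} \to \mathbf{0}.$$

By Theorem 2.34, there is a positive constant $c$ such that

$$\|\mathbf{T}(\mathbf{h})\| \leq c\|\mathbf{h}\| \quad\text{for all } \mathbf{h} \in \mathbb{R}^n.$$

Therefore,

$$\|\mathbf{F}(\mathbf{x}_0+\mathbf{h}) - \mathbf{F}(\mathbf{x}_0)\| \leq \|\mathbf{T}(\mathbf{h})\| + \|\mathbf{h}\|\,\|\boldsymbol{\varepsilon}(\mathbf{h})\| \leq \|\mathbf{h}\|\left(c + \|\boldsymbol{\varepsilon}(\mathbf{h})\|\right).$$

This implies that

$$\lim_{\mathbf{h}\to\mathbf{0}}\mathbf{F}(\mathbf{x}_0+\mathbf{h}) = \mathbf{F}(\mathbf{x}_0).$$

Thus, $\mathbf{F} : O \to \mathbb{R}^m$ is continuous at $\mathbf{x}_0$.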


Example 4.12 


The function $f : \mathbb{R}^2 \to \mathbb{R}$ defined as

$$f(x,y) = \begin{cases} \dfrac{xy}{x^2+y^2}, & \text{if } (x,y) \neq (0,0), \\ 0, & \text{if } (x,y) = (0,0), \end{cases}$$

in Example 4.6 is not differentiable at $(0,0)$ since it is not continuous at $(0,0)$. However, we have shown that it has partial derivatives at $(0,0)$.


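Let us study the function $\mathbf{F} : \mathbb{R}^n \to \mathbb{R}^m$, $\mathbf{F}(\mathbf{x}) = A\mathbf{x} + \mathbf{b}$ that is defined in Example 4.11. The component functions of $\mathbf{F}$ are

$$\begin{aligned} F_1(x_1, x_2, \ldots, x_n) &= a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n + b_1, \\ F_2(x_1, x_2, \ldots, x_n) &= a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n + b_2, \\ &\;\;\vdots \\ F_m(x_1, x_2, \ldots, x_n) &= a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n + b_m. \end{aligned}$$

Notice that

$$\begin{aligned} \nabla F_1(\mathbf{x}) &= \mathbf{a}_1 = (a_{11}, a_{12}, \ldots, a_{1n}), \\ \nabla F_2(\mathbf{x}) &= \mathbf{a}_2 = (a_{21}, a_{22}, \ldots, a_{2n}), \\ &\;\;\vdots \\ \nabla F_m(\mathbf{x}) &= \mathbf{a}_m = (a_{m1}, a_{m2}, \ldots, a_{mn}) \end{aligned}$$

are the row vectors of $A$. Hence, the derivative matrix of $\mathbf{F}$ is given by

$$\mathbf{DF}(\mathbf{x}) = \begin{bmatrix} \nabla F_1(\mathbf{x}) \\ \nabla F_2(\mathbf{x}) \\ \vdots \\ \nabla F_m(\mathbf{x}) \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix},$$

which is the matrix $A$ itself. Observe that

$$\mathbf{DF}(\mathbf{x})\mathbf{h} = \begin{bmatrix} a_{11}h_1 + a_{12}h_2 + \cdots + a_{1n}h_n \\ a_{21}h_1 + a_{22}h_2 + \cdots + a_{2n}h_n \\ \vdots \\ a_{m1}h_1 + a_{m2}h_2 + \cdots + a_{mn}h_n \end{bmatrix} = \begin{bmatrix} \langle\nabla F_1(\mathbf{x}), \mathbf{h}\rangle \\ \langle\nabla F_2(\mathbf{x}), \mathbf{h}\rangle \\ \vdots \\ \langle\nabla F_m(\mathbf{x}), \mathbf{h}\rangle \end{bmatrix}.$$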


From Example 4.11, we suspect that the linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$ that appears in the definition of differentiability of a function should be the linear transformation defined by the derivative matrix. In fact, this is the case.


Theorem 4.6 


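Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $\mathbf{F} : O \to \mathbb{R}^m$ be a function defined on $O$. The following are equivalent.

(a) The function $\mathbf{F} : O \to \mathbb{R}^m$ is differentiable at $\mathbf{x}_0$.

(b) The function $\mathbf{F} : O \to \mathbb{R}^m$ has partial derivatives at $\mathbf{x}_0$, and

$$\lim_{\mathbf{h}\to\mathbf{0}}\frac{\mathbf{F}(\mathbf{x}_0+\mathbf{h}) - \mathbf{F}(\mathbf{x}_0) - \mathbf{DF}(\mathbf{x}_0)\mathbf{h}}{\|\mathbf{h}\|} = \mathbf{0}. \tag{4.5}$$

(c) For each $1 \leq j \leq m$, the component function $F_j : O \to \mathbb{R}$ has partial derivatives at $\mathbf{x}_0$, and

$$\lim_{\mathbf{h}\to\mathbf{0}}\frac{F_j(\mathbf{x}_0+\mathbf{h}) - F_j(\mathbf{x}_0) - \langle\nabla F_j(\mathbf{x}_0), \mathbf{h}\rangle}{\|\mathbf{h}\|} = 0.$$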




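The equivalence of (b) and (c) is Proposition 4.4, the componentwise differentiability. Thus, we are left to prove the equivalence of (a) and (b). First, we prove (b) implies (a). If (b) holds, let $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$ be the linear transformation defined by the derivative matrix $\mathbf{DF}(\mathbf{x}_0)$. Then (4.5) says that $\mathbf{F} : O \to \mathbb{R}^m$ is differentiable at $\mathbf{x}_0$.

Conversely, assume that $\mathbf{F} : O \to \mathbb{R}^m$ is differentiable at $\mathbf{x}_0$. Then there exists a linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^m$ such that

$$\lim_{\mathbf{h}\to\mathbf{0}}\frac{\mathbf{F}(\mathbf{x}_0+\mathbf{h}) - \mathbf{F}(\mathbf{x}_0) - \mathbf{T}(\mathbf{h})}{\|\mathbf{h}\|} = \mathbf{0}. \tag{4.6}$$

Let $A$ be an $m \times n$ matrix so that $\mathbf{T}(\mathbf{h}) = A\mathbf{h}$. For $1 \leq i \leq n$, taking $\mathbf{h} = h\mathbf{e}_i$ in eq. (4.6) implies that

$$\lim_{h\to 0}\frac{\mathbf{F}(\mathbf{x}_0+h\mathbf{e}_i) - \mathbf{F}(\mathbf{x}_0) - A(h\mathbf{e}_i)}{h} = \mathbf{0}.$$

This gives

$$\lim_{h\to 0}\frac{\mathbf{F}(\mathbf{x}_0+h\mathbf{e}_i) - \mathbf{F}(\mathbf{x}_0)}{h} = A\mathbf{e}_i.$$

This shows that $\dfrac{\partial\mathbf{F}}{\partial x_i}(\mathbf{x}_0)$ exists and

$$\frac{\partial\mathbf{F}}{\partial x_i}(\mathbf{x}_0) = A\mathbf{e}_i.$$

Therefore, $\mathbf{F} : O \to \mathbb{R}^m$ has partial derivatives at $\mathbf{x}_0$. Since

$$A = \begin{bmatrix} A\mathbf{e}_1 \,|\, A\mathbf{e}_2 \,|\, \cdots \,|\, A\mathbf{e}_n \end{bmatrix} = \mathbf{DF}(\mathbf{x}_0),$$

eq. (4.6) says that

$$\lim_{\mathbf{h}\to\mathbf{0}}\frac{\mathbf{F}(\mathbf{x}_0+\mathbf{h}) - \mathbf{F}(\mathbf{x}_0) - \mathbf{DF}(\mathbf{x}_0)\mathbf{h}}{\|\mathbf{h}\|} = \mathbf{0}.$$

This proves (a) implies (b).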




Corollary 4.7 


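Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $\mathbf{F} : O \to \mathbb{R}^m$ be a function defined on $O$. If the partial derivatives of $\mathbf{F} : O \to \mathbb{R}^m$ exist at $\mathbf{x}_0$, but

$$\lim_{\mathbf{h}\to\mathbf{0}}\frac{\mathbf{F}(\mathbf{x}_0+\mathbf{h}) - \mathbf{F}(\mathbf{x}_0) - \mathbf{DF}(\mathbf{x}_0)\mathbf{h}}{\|\mathbf{h}\|} \neq \mathbf{0},$$

then $\mathbf{F}$ is not differentiable at $\mathbf{x}_0$.


If $\mathbf{F}$ is differentiable at $\mathbf{x}_0$, Theorem 4.6 says that we must have

$$\lim_{\mathbf{h}\to\mathbf{0}}\frac{\mathbf{F}(\mathbf{x}_0+\mathbf{h}) - \mathbf{F}(\mathbf{x}_0) - \mathbf{DF}(\mathbf{x}_0)\mathbf{h}}{\|\mathbf{h}\|} = \mathbf{0}.$$

By contrapositive, since

$$\lim_{\mathbf{h}\to\mathbf{0}}\frac{\mathbf{F}(\mathbf{x}_0+\mathbf{h}) - \mathbf{F}(\mathbf{x}_0) - \mathbf{DF}(\mathbf{x}_0)\mathbf{h}}{\|\mathbf{h}\|} \neq \mathbf{0},$$

we find that $\mathbf{F}$ is not differentiable at $\mathbf{x}_0$.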


Example 4.13 


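Let $f : \mathbb{R}^2 \to \mathbb{R}$ be the function defined as

$$f(x,y) = \begin{cases} \dfrac{x^3}{x^2+y^2}, & \text{if } (x,y) \neq (0,0), \\ 0, & \text{if } (x,y) = (0,0). \end{cases}$$

Determine whether $f$ is differentiable at $(0,0)$.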


Solution 
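One can show that $f$ is continuous at $\mathbf{0} = (0,0)$. Hence, we cannot use continuity to determine whether $f$ is differentiable at $\mathbf{0}$. Notice that

$$f_x(0,0) = \lim_{h\to 0}\frac{f(h,0) - f(0,0)}{h} = \lim_{h\to 0}\frac{h - 0}{h} = 1,$$

$$f_y(0,0) = \lim_{h\to 0}\frac{f(0,h) - f(0,0)}{h} = \lim_{h\to 0}\frac{0 - 0}{h} = 0.$$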




Figure 4.5: The function f(x, y) defined in Example 4.13. 




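Therefore, $f$ has partial derivatives at $\mathbf{0}$, and $\nabla f(\mathbf{0}) = (1,0)$. Now we consider the function

$$\varepsilon(\mathbf{h}) = \frac{f(\mathbf{h}) - f(\mathbf{0}) - \langle\nabla f(\mathbf{0}), \mathbf{h}\rangle}{\|\mathbf{h}\|} = \frac{\dfrac{h_1^3}{h_1^2+h_2^2} - h_1}{\sqrt{h_1^2+h_2^2}} = -\frac{h_1h_2^2}{(h_1^2+h_2^2)^{3/2}}.$$

Let $\{\mathbf{h}_k\}$ be the sequence with $\mathbf{h}_k = \left(\dfrac{1}{k}, \dfrac{1}{k}\right)$. It converges to $\mathbf{0}$. Since

$$\varepsilon(\mathbf{h}_k) = -\frac{1}{2\sqrt{2}} \quad\text{for all } k \in \mathbb{Z}^+,$$

the sequence $\{\varepsilon(\mathbf{h}_k)\}$ does not converge to 0. Hence,

$$\lim_{\mathbf{h}\to\mathbf{0}}\frac{f(\mathbf{h}) - f(\mathbf{0}) - \langle\nabla f(\mathbf{0}), \mathbf{h}\rangle}{\|\mathbf{h}\|} \neq 0.$$

Therefore, $f$ is not differentiable at $(0,0)$.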
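The behaviour of the error quotient along the diagonal can be observed numerically. The sketch below assumes that the function of Example 4.13 is $f(x,y) = x^3/(x^2+y^2)$ away from the origin, which is consistent with $\nabla f(\mathbf{0}) = (1,0)$; the name `eps` and the sample values of $k$ are illustrative choices.

```python
import math

def f(x, y):
    # Assumed form of the function in Example 4.13
    if x == 0.0 and y == 0.0:
        return 0.0
    return x**3 / (x * x + y * y)

def eps(h1, h2):
    """Error quotient (f(h) - f(0) - <grad f(0), h>) / ||h|| with grad f(0) = (1, 0)."""
    grad = (1.0, 0.0)
    linear_part = grad[0] * h1 + grad[1] * h2
    return (f(h1, h2) - f(0.0, 0.0) - linear_part) / math.hypot(h1, h2)

# Evaluate along h_k = (1/k, 1/k), which converges to the origin
values = [eps(1.0 / k, 1.0 / k) for k in (1, 10, 100, 1000)]
```

Every entry of `values` stays near $-1/(2\sqrt{2})$ instead of approaching 0, which is the obstruction to differentiability used in the solution.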


Example 4.13 gives a function which is continuous and has partial derivatives 
at a point, yet it fails to be differentiable at that point. In the following, we are 
going to give a sufficient condition for differentiability. We begin with a lemma. 




Lemma 4.8 


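Let $\mathbf{x}_0$ be a point in $\mathbb{R}^n$ and let $f : B(\mathbf{x}_0, r) \to \mathbb{R}$ be a function defined on an open ball centered at $\mathbf{x}_0$. Assume that $f : B(\mathbf{x}_0, r) \to \mathbb{R}$ has first order partial derivatives. For each $\mathbf{h}$ in $\mathbb{R}^n$ with $\|\mathbf{h}\| < r$, there exist $\mathbf{z}_1, \ldots, \mathbf{z}_n$ in $B(\mathbf{x}_0, r)$ such that

$$f(\mathbf{x}_0+\mathbf{h}) - f(\mathbf{x}_0) = \sum_{i=1}^n h_i\frac{\partial f}{\partial x_i}(\mathbf{z}_i),$$

and

$$\|\mathbf{z}_i - \mathbf{x}_0\| \leq \|\mathbf{h}\| \quad\text{for all } 1 \leq i \leq n.$$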


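We will take a zigzag path from $\mathbf{x}_0$ to $\mathbf{x}_0 + \mathbf{h}$, which is a union of paths parallel to the coordinate axes. For $1 \leq i \leq n$, let

$$\mathbf{x}_i = \mathbf{x}_0 + \sum_{k=1}^i h_k\mathbf{e}_k = \mathbf{x}_0 + h_1\mathbf{e}_1 + \cdots + h_i\mathbf{e}_i.$$

Then $\mathbf{x}_i$ is in $B(\mathbf{x}_0, r)$. Notice that $B(\mathbf{x}_0, r)$ is a convex set. Therefore, for any $1 \leq i \leq n$, the line segment between $\mathbf{x}_{i-1}$ and $\mathbf{x}_i = \mathbf{x}_{i-1} + h_i\mathbf{e}_i$ lies entirely inside $B(\mathbf{x}_0, r)$. Since $f : B(\mathbf{x}_0, r) \to \mathbb{R}$ has first order partial derivative with respect to $x_i$, the function $g_i : [0,1] \to \mathbb{R}$,

$$g_i(t) = f(\mathbf{x}_{i-1} + th_i\mathbf{e}_i)$$

is differentiable and

$$g_i'(t) = h_i\frac{\partial f}{\partial x_i}(\mathbf{x}_{i-1} + th_i\mathbf{e}_i).$$

By the mean value theorem, there exists $c_i \in (0,1)$ such that

$$f(\mathbf{x}_i) - f(\mathbf{x}_{i-1}) = g_i(1) - g_i(0) = g_i'(c_i) = h_i\frac{\partial f}{\partial x_i}(\mathbf{x}_{i-1} + c_ih_i\mathbf{e}_i).$$

Let

$$\mathbf{z}_i = \mathbf{x}_{i-1} + c_ih_i\mathbf{e}_i = \mathbf{x}_0 + \sum_{k=1}^{i-1} h_k\mathbf{e}_k + c_ih_i\mathbf{e}_i.$$

Then $\mathbf{z}_i$ is a point in $B(\mathbf{x}_0, r)$. Moreover,

$$f(\mathbf{x}_0+\mathbf{h}) - f(\mathbf{x}_0) = \sum_{i=1}^n \left(f(\mathbf{x}_i) - f(\mathbf{x}_{i-1})\right) = \sum_{i=1}^n h_i\frac{\partial f}{\partial x_i}(\mathbf{z}_i).$$

For $1 \leq i \leq n$, since $c_i \in (0,1)$, we have

$$\|\mathbf{z}_i - \mathbf{x}_0\| = \sqrt{h_1^2 + \cdots + h_{i-1}^2 + c_i^2h_i^2} \leq \sqrt{h_1^2 + \cdots + h_{i-1}^2 + h_i^2} \leq \|\mathbf{h}\|.$$

This completes the proof.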


Figure 4.6: A zigzag path from x9 to xp + h. 


Theorem 4.9 


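Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $\mathbf{F} : O \to \mathbb{R}^m$ be a function defined on $O$. If the partial derivatives of $\mathbf{F} : O \to \mathbb{R}^m$ exist and are continuous at $\mathbf{x}_0$, then $\mathbf{F}$ is differentiable at $\mathbf{x}_0$.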




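By Proposition 4.4, it suffices to prove the theorem for a function $f : O \to \mathbb{R}$ with codomain $\mathbb{R}$. Since $O$ is an open set that contains the point $\mathbf{x}_0$, there exists $r > 0$ such that $B(\mathbf{x}_0, r) \subset O$. By Lemma 4.8, for each $\mathbf{h}$ that satisfies $0 < \|\mathbf{h}\| < r$, there exist $\mathbf{z}_1, \mathbf{z}_2, \ldots, \mathbf{z}_n$ such that

$$f(\mathbf{x}_0+\mathbf{h}) - f(\mathbf{x}_0) = \sum_{i=1}^n h_i\frac{\partial f}{\partial x_i}(\mathbf{z}_i)$$

and

$$\|\mathbf{z}_i - \mathbf{x}_0\| \leq \|\mathbf{h}\| \quad\text{for all } 1 \leq i \leq n.$$

Therefore,

$$\frac{f(\mathbf{x}_0+\mathbf{h}) - f(\mathbf{x}_0) - \langle\nabla f(\mathbf{x}_0), \mathbf{h}\rangle}{\|\mathbf{h}\|} = \sum_{i=1}^n \frac{h_i}{\|\mathbf{h}\|}\left(\frac{\partial f}{\partial x_i}(\mathbf{z}_i) - \frac{\partial f}{\partial x_i}(\mathbf{x}_0)\right).$$

Fix $\varepsilon > 0$. For $1 \leq i \leq n$, since $f_{x_i} : B(\mathbf{x}_0, r) \to \mathbb{R}$ is continuous at $\mathbf{x}_0$, there exists $0 < \delta_i \leq r$ such that if $\|\mathbf{z} - \mathbf{x}_0\| < \delta_i$, then

$$|f_{x_i}(\mathbf{z}) - f_{x_i}(\mathbf{x}_0)| < \frac{\varepsilon}{n}.$$

Take $\delta = \min\{\delta_1, \ldots, \delta_n\}$. Then $\delta > 0$. If $0 < \|\mathbf{h}\| < \delta$, then for $1 \leq i \leq n$, $\|\mathbf{z}_i - \mathbf{x}_0\| \leq \|\mathbf{h}\| < \delta \leq \delta_i$. Thus,

$$|f_{x_i}(\mathbf{z}_i) - f_{x_i}(\mathbf{x}_0)| < \frac{\varepsilon}{n}.$$

This implies that

$$\left|\frac{f(\mathbf{x}_0+\mathbf{h}) - f(\mathbf{x}_0) - \langle\nabla f(\mathbf{x}_0), \mathbf{h}\rangle}{\|\mathbf{h}\|}\right| \leq \sum_{i=1}^n \frac{|h_i|}{\|\mathbf{h}\|}\left|\frac{\partial f}{\partial x_i}(\mathbf{z}_i) - \frac{\partial f}{\partial x_i}(\mathbf{x}_0)\right| < \varepsilon.$$

This proves that $f$ is differentiable at $\mathbf{x}_0$.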




Theorem 4.9 says that a function which has continuous partial derivatives is 
differentiable. This prompts us to make the following definition. 


Definition 4.10 Continuously Differentiable 


Let $O$ be an open subset of $\mathbb{R}^n$, and let $\mathbf{F} : O \to \mathbb{R}^m$ be a function defined on $O$. We say that $\mathbf{F} : O \to \mathbb{R}^m$ is continuously differentiable, or $C^1$, provided that it has partial derivatives that are continuous.


Theorem 4.9 says that a continuously differentiable function is differentiable. 


Analogously, we define $C^k$ for any $k \geq 1$.


Definition 4.11 $C^k$ Functions


Let $O$ be an open subset of $\mathbb{R}^n$, and let $\mathbf{F} : O \to \mathbb{R}^m$ be a function defined on $O$. We say that $\mathbf{F} : O \to \mathbb{R}^m$ is $k$-times continuously differentiable, or $C^k$, provided that it has all partial derivatives of order $k$, and each of them is continuous.


Definition 4.12 $C^\infty$ Functions


Let $O$ be an open subset of $\mathbb{R}^n$, and let $\mathbf{F} : O \to \mathbb{R}^m$ be a function defined on $O$. We say that $\mathbf{F} : O \to \mathbb{R}^m$ is infinitely differentiable, or $C^\infty$, provided that it is $C^k$ for all positive integers $k$.


Proposition 4.10 


Polynomials and rational functions are infinitely differentiable functions. 


Sketch of Proof 
A partial derivative of a rational function is still a rational function, which 


is continuous. 


Obviously, for any $k \in \mathbb{Z}^+$, a $C^{k+1}$ function is $C^k$.




Remark 4.8 Higher Order Differentiability 


We can define second order differentiability in the following way. We say that a function $\mathbf{F} : O \to \mathbb{R}^m$ is twice differentiable at a point $\mathbf{x}_0$ in $O$ if there is a neighbourhood of $\mathbf{x}_0$ on which $\mathbf{F}$ has first order partial derivatives, and each of them is differentiable at the point $\mathbf{x}_0$. Theorem 4.9 says that a $C^2$ function is twice differentiable.

Similarly, we can define higher order differentiability. 


4.2.2 First Order Approximations 
First we extend the concept of order of approximation to multivariable functions. 


Definition 4.13 Order of Approximation 


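Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $k$ be a positive integer. We say that the two functions $\mathbf{F} : O \to \mathbb{R}^m$ and $\mathbf{G} : O \to \mathbb{R}^m$ are $k$-th order approximations of each other at $\mathbf{x}_0$ provided that

$$\lim_{\mathbf{h}\to\mathbf{0}}\frac{\mathbf{F}(\mathbf{x}_0+\mathbf{h}) - \mathbf{G}(\mathbf{x}_0+\mathbf{h})}{\|\mathbf{h}\|^k} = \mathbf{0}.$$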


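Recall that a mapping $\mathbf{G} : O \to \mathbb{R}^m$ is a polynomial mapping of degree at most one if it has the form

$$\mathbf{G}(\mathbf{x}) = \begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n + b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n + b_2 \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n + b_m \end{bmatrix} = A\mathbf{x} + \mathbf{b},$$

where $A = [a_{ij}]$ and $\mathbf{b} = (b_1, \ldots, b_m)$. The mapping $\mathbf{G}$ is a linear transformation if and only if $\mathbf{b} = \mathbf{0}$.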
The following theorem shows that first order approximation is closely related 
to differentiability. It is a consequence of Theorem 4.6. 




Theorem 4.11 First Order Approximation Theorem 


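Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $\mathbf{F} : O \to \mathbb{R}^m$ be a function defined on $O$.

(a) If $\mathbf{F} : O \to \mathbb{R}^m$ is continuous at $\mathbf{x}_0$, and there is a polynomial mapping $\mathbf{G} : O \to \mathbb{R}^m$ of degree at most one which is a first order approximation of $\mathbf{F} : O \to \mathbb{R}^m$ at the point $\mathbf{x}_0$, then $\mathbf{F} : O \to \mathbb{R}^m$ is differentiable at $\mathbf{x}_0$.

(b) If $\mathbf{F} : O \to \mathbb{R}^m$ is differentiable at $\mathbf{x}_0$, then there is a unique polynomial mapping $\mathbf{G} : O \to \mathbb{R}^m$ of degree at most one which is a first order approximation of $\mathbf{F}$ at $\mathbf{x}_0$. It is given by

$$\mathbf{G}(\mathbf{x}) = \mathbf{F}(\mathbf{x}_0) + \mathbf{DF}(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0).$$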


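First we prove (a). Assume that $\mathbf{G} : O \to \mathbb{R}^m$ is a polynomial mapping of degree at most one which is a first order approximation of $\mathbf{F} : O \to \mathbb{R}^m$ at the point $\mathbf{x}_0$. There exists an $m \times n$ matrix $A$ and a vector $\mathbf{b}$ in $\mathbb{R}^m$ such that

$$\mathbf{G}(\mathbf{x}) = A\mathbf{x} + \mathbf{b}.$$

By assumption,

$$\lim_{\mathbf{h}\to\mathbf{0}}\frac{\mathbf{F}(\mathbf{x}_0+\mathbf{h}) - A(\mathbf{x}_0+\mathbf{h}) - \mathbf{b}}{\|\mathbf{h}\|} = \mathbf{0}. \tag{4.7}$$

This implies that

$$\lim_{\mathbf{h}\to\mathbf{0}}\left(\mathbf{F}(\mathbf{x}_0+\mathbf{h}) - A(\mathbf{x}_0+\mathbf{h}) - \mathbf{b}\right) = \mathbf{0},$$

which gives

$$A\mathbf{x}_0 + \mathbf{b} = \lim_{\mathbf{h}\to\mathbf{0}}\mathbf{F}(\mathbf{x}_0+\mathbf{h}) = \mathbf{F}(\mathbf{x}_0).$$

Substituting back into (4.7), we find that

$$\lim_{\mathbf{h}\to\mathbf{0}}\frac{\mathbf{F}(\mathbf{x}_0+\mathbf{h}) - \mathbf{F}(\mathbf{x}_0) - A\mathbf{h}}{\|\mathbf{h}\|} = \mathbf{0}.$$

Since $\mathbf{T}(\mathbf{h}) = A\mathbf{h}$ is a linear transformation, this shows that $\mathbf{F} : O \to \mathbb{R}^m$ is differentiable at $\mathbf{x}_0$.

Next, we prove (b). If $\mathbf{F} : O \to \mathbb{R}^m$ is differentiable at $\mathbf{x}_0$, Theorem 4.6 says that

$$\lim_{\mathbf{h}\to\mathbf{0}}\frac{\mathbf{F}(\mathbf{x}_0+\mathbf{h}) - \mathbf{F}(\mathbf{x}_0) - \mathbf{DF}(\mathbf{x}_0)\mathbf{h}}{\|\mathbf{h}\|} = \mathbf{0}.$$

This precisely means that the polynomial mapping $\mathbf{G} : O \to \mathbb{R}^m$,

$$\mathbf{G}(\mathbf{x}) = \mathbf{F}(\mathbf{x}_0) + \mathbf{DF}(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0),$$

is a first order approximation of $\mathbf{F} : O \to \mathbb{R}^m$ at $\mathbf{x}_0$. By definition, the polynomial mapping $\mathbf{G}$ has degree at most one. The uniqueness of $\mathbf{G}$ is also asserted in Theorem 4.6.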


Remark 4.9 


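The first order approximation theorem says that if the function $\mathbf{F} : O \to \mathbb{R}^m$ is differentiable at the point $\mathbf{u}$, then there is a unique polynomial mapping $\mathbf{G} : O \to \mathbb{R}^m$ of degree at most one which is a first order approximation of $\mathbf{F} : O \to \mathbb{R}^m$ at the point $\mathbf{u}$. The components of the mapping $\mathbf{G} : O \to \mathbb{R}^m$ are given by

$$G_j(x_1, \ldots, x_n) = F_j(u_1, \ldots, u_n) + \sum_{i=1}^n \frac{\partial F_j}{\partial x_i}(u_1, \ldots, u_n)(x_i - u_i).$$

Notice that this is a generalization of the Taylor polynomial of order 1.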


Example 4.14 


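Let $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^2$ be the function defined as

$$\mathbf{F}(x,y,z) = (xyz^2, x + 2y + 3z),$$

and let $\mathbf{x}_0 = (1,-1,1)$. Find a vector $\mathbf{b}$ in $\mathbb{R}^2$ and a $2 \times 3$ matrix $A$ such that

$$\lim_{\mathbf{h}\to\mathbf{0}}\frac{\mathbf{F}(\mathbf{x}_0+\mathbf{h}) - \mathbf{b} - A\mathbf{h}}{\|\mathbf{h}\|} = \mathbf{0}.$$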




Solution 


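The function $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^2$ is itself a polynomial mapping. Hence, it is differentiable. The derivative matrix is given by

$$\mathbf{DF}(\mathbf{x}) = \begin{bmatrix} yz^2 & xz^2 & 2xyz \\ 1 & 2 & 3 \end{bmatrix}.$$

By the first order approximation theorem, $\mathbf{b} = \mathbf{F}(\mathbf{x}_0) = (-1, 2)$ and

$$A = \mathbf{DF}(1,-1,1) = \begin{bmatrix} -1 & 1 & -2 \\ 1 & 2 & 3 \end{bmatrix}.$$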
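The matrix $A$ can be compared against a finite-difference approximation of the derivative matrix. This is a rough numerical sketch only: the helper `jacobian_fd` and its step size are assumptions made for illustration, and the mapping is taken to be $\mathbf{F}(x,y,z) = (xyz^2, x+2y+3z)$.

```python
def F(x, y, z):
    # Mapping of Example 4.14 (assumed form)
    return (x * y * z**2, x + 2 * y + 3 * z)

def jacobian_fd(func, point, h=1e-6):
    """Approximate the derivative matrix of func at point by central differences."""
    m = len(func(*point))
    n = len(point)
    J = [[0.0] * n for _ in range(m)]
    for i in range(n):
        plus = list(point)
        minus = list(point)
        plus[i] += h
        minus[i] -= h
        fp = func(*plus)
        fm = func(*minus)
        for j in range(m):
            # Column i holds the partial derivatives with respect to the i-th variable
            J[j][i] = (fp[j] - fm[j]) / (2 * h)
    return J

x0 = (1.0, -1.0, 1.0)
A = [[-1.0, 1.0, -2.0], [1.0, 2.0, 3.0]]
b = F(*x0)
J = jacobian_fd(F, x0)
```

The entries of `J` agree with those of `A` up to the finite-difference error, and `b` is the value of the mapping at the base point.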


Example 4.15 


Determine whether the limit $\displaystyle\lim_{(x,y)\to(0,0)}\frac{e^{x+2y} - 1 - x - 2y}{\sqrt{x^2+y^2}}$ exists.


Solution 
Let $f(x,y) = e^{x+2y}$. Then

$$\frac{\partial f}{\partial x}(x,y) = e^{x+2y}, \qquad \frac{\partial f}{\partial y}(x,y) = 2e^{x+2y}.$$

It follows that

$$f(0,0) = 1, \qquad \frac{\partial f}{\partial x}(0,0) = 1, \qquad \frac{\partial f}{\partial y}(0,0) = 2.$$

Since the function $g(x,y) = x + 2y$ is continuous and the exponential function is also continuous, $f$ has continuous first order partial derivatives. Hence, $f$ is differentiable. By the first order approximation theorem,

$$\lim_{(x,y)\to(0,0)}\frac{f(x,y) - f(0,0) - x\dfrac{\partial f}{\partial x}(0,0) - y\dfrac{\partial f}{\partial y}(0,0)}{\sqrt{x^2+y^2}} = 0.$$

Since

$$f(x,y) - f(0,0) - x\frac{\partial f}{\partial x}(0,0) - y\frac{\partial f}{\partial y}(0,0) = e^{x+2y} - 1 - x - 2y,$$

we find that

$$\lim_{(x,y)\to(0,0)}\frac{e^{x+2y} - 1 - x - 2y}{\sqrt{x^2+y^2}} = 0.$$
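A quick numerical look at the quotient along one path is consistent with this limit, although sampling a single path does not prove anything. The sketch below assumes the quotient is $(e^{x+2y} - 1 - x - 2y)/\sqrt{x^2+y^2}$; the function name `ratio` and the sample points are illustrative.

```python
import math

def ratio(x, y):
    """(e^{x+2y} - 1 - x - 2y) / sqrt(x^2 + y^2); expm1 limits cancellation error."""
    s = x + 2 * y
    return (math.expm1(s) - s) / math.hypot(x, y)

# Sample along the diagonal (t, t) with t shrinking towards 0
samples = [ratio(t, t) for t in (1e-1, 1e-2, 1e-3, 1e-4)]
```

The values in `samples` shrink roughly in proportion to `t`, as one expects from a remainder of second order.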


4.2.3 Tangent Planes 


The tangent plane to a graph is closely related to the concept of differentiability and first order approximations. Recall that the graph of a function $f : O \to \mathbb{R}$ defined on a subset of $\mathbb{R}^n$ is the subset of $\mathbb{R}^{n+1}$ consisting of all the points of the form $(\mathbf{x}, f(\mathbf{x}))$ where $\mathbf{x} \in O$.


Definition 4.14 Tangent Planes 


Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $f : O \to \mathbb{R}$ be a function defined on $O$. The graph of $f$ has a tangent plane at $\mathbf{x}_0$ if $f$ is differentiable at $\mathbf{x}_0$. In this case, the tangent plane is the hyperplane of $\mathbb{R}^{n+1}$ that satisfies the equation

$$x_{n+1} = f(\mathbf{x}_0) + \langle\nabla f(\mathbf{x}_0), \mathbf{x} - \mathbf{x}_0\rangle, \quad\text{where } \mathbf{x} = (x_1, \ldots, x_n).$$

The tangent plane is the graph of the polynomial function of degree at most one which is the first order approximation of the function $f$ at the point $\mathbf{x}_0$.


Example 4.16 


Find the equation of the tangent plane to the graph of the function $f : \mathbb{R}^2 \to \mathbb{R}$, $f(x,y) = x^2 + 4xy + 5y^2$ at the point where $(x,y) = (1,-1)$.


Solution
The function $f$ is a polynomial. Hence, it is a differentiable function with

$$\nabla f(x,y) = (2x + 4y, 4x + 10y).$$




Figure 4.7: The tangent plane to the graph of a function. 


From this, we find that $\nabla f(1,-1) = (-2,-6)$. Together with $f(1,-1) = 2$, we find that the equation of the tangent plane to the graph of $f$ at the point where $(x,y) = (1,-1)$ is

$$z = 2 - 2(x-1) - 6(y+1) = -2x - 6y - 2.$$
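One way to see that this plane is the first order approximation is to check that the gap between the graph and the plane is small compared with the distance to the point of tangency. The following sketch uses illustrative helper names and assumes the function and plane are $f(x,y) = x^2+4xy+5y^2$ and $z = -2x-6y-2$.

```python
import math

def f(x, y):
    return x * x + 4 * x * y + 5 * y * y

def plane(x, y):
    # Tangent plane at (1, -1): z = -2x - 6y - 2
    return -2 * x - 6 * y - 2

def relative_error(a, b):
    """(f - plane) at (1 + a, -1 + b), divided by the distance ||(a, b)||."""
    return (f(1 + a, -1 + b) - plane(1 + a, -1 + b)) / math.hypot(a, b)
```

Here `relative_error(1e-2, 1e-2)` and `relative_error(1e-4, 1e-4)` shrink with the displacement, since the gap equals $a^2 + 4ab + 5b^2$, which is of second order.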


4.2.4 Directional Derivatives 


As we mentioned before, the partial derivatives measure the rate of change of the 
function when it varies along the directions of the coordinate axes. To capture 
the rate of change of a function along other directions, we define the concept of 


directional derivatives. Notice that a direction in IR” is specified by a unit vector. 


Definition 4.15 Directional Derivatives 


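Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $\mathbf{F} : O \to \mathbb{R}^m$ be a function defined on $O$. Given a unit vector $\mathbf{u}$ in $\mathbb{R}^n$, we say that $\mathbf{F}$ has a directional derivative in the direction of $\mathbf{u}$ at the point $\mathbf{x}_0$ provided that the limit

$$\lim_{h\to 0}\frac{\mathbf{F}(\mathbf{x}_0+h\mathbf{u}) - \mathbf{F}(\mathbf{x}_0)}{h}$$

exists. This limit, denoted as $D_{\mathbf{u}}\mathbf{F}(\mathbf{x}_0)$, is called the directional derivative of $\mathbf{F}$ in the direction of $\mathbf{u}$ at the point $\mathbf{x}_0$.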




When $m = 1$, it is customary to denote the directional derivative of $f : O \to \mathbb{R}$ in the direction of $\mathbf{u}$ at the point $\mathbf{x}_0$ as $D_{\mathbf{u}}f(\mathbf{x}_0)$.


Remark 4.10 


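For any nonzero vector $\mathbf{v}$ in $\mathbb{R}^n$, we can also define $D_{\mathbf{v}}\mathbf{F}(\mathbf{x}_0)$ as

$$D_{\mathbf{v}}\mathbf{F}(\mathbf{x}_0) = \lim_{h\to 0}\frac{\mathbf{F}(\mathbf{x}_0+h\mathbf{v}) - \mathbf{F}(\mathbf{x}_0)}{h}.$$

However, we will not call it a directional derivative unless $\mathbf{v}$ is a unit vector.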


Remark 4.11 

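From the definition, it is obvious that when $\mathbf{u}$ is one of the standard unit vectors $\mathbf{e}_1, \ldots, \mathbf{e}_n$, then the directional derivative in the direction of $\mathbf{u}$ is a partial derivative. More precisely,

$$D_{\mathbf{e}_i}\mathbf{F}(\mathbf{x}_0) = \frac{\partial\mathbf{F}}{\partial x_i}(\mathbf{x}_0), \quad 1 \leq i \leq n.$$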


The following is obvious. 


Proposition 4.12 


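Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $\mathbf{F} : O \to \mathbb{R}^m$ be a function defined on $O$. Given a nonzero vector $\mathbf{v}$ in $\mathbb{R}^n$, $D_{\mathbf{v}}\mathbf{F}(\mathbf{x}_0)$ exists if and only if $D_{\mathbf{v}}F_j(\mathbf{x}_0)$ exists for all $1 \leq j \leq m$. Moreover,

$$D_{\mathbf{v}}\mathbf{F}(\mathbf{x}_0) = \left(D_{\mathbf{v}}F_1(\mathbf{x}_0), \ldots, D_{\mathbf{v}}F_m(\mathbf{x}_0)\right).$$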


Example 4.17 


Let $f : \mathbb{R}^2 \to \mathbb{R}$ be the function defined as

$$f(x,y) = x^2y.$$

Given that $\mathbf{v} = (v_1, v_2)$ is a nonzero vector in $\mathbb{R}^2$, find $D_{\mathbf{v}}f(3,2)$.




Solution 
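By definition,

$$D_{\mathbf{v}}f(3,2) = \lim_{h\to 0}\frac{f(3+hv_1, 2+hv_2) - f(3,2)}{h} = g'(0),$$

where

$$g(h) = f(3+hv_1, 2+hv_2) = (3+hv_1)^2(2+hv_2).$$

Since

$$g'(h) = 2v_1(3+hv_1)(2+hv_2) + v_2(3+hv_1)^2,$$

we find that

$$D_{\mathbf{v}}f(3,2) = g'(0) = 12v_1 + 9v_2.$$

Taking $\mathbf{v} = \mathbf{e}_1 = (1,0)$ and $\mathbf{v} = \mathbf{e}_2 = (0,1)$ respectively, we find that $f_x(3,2) = 12$ and $f_y(3,2) = 9$. For general $\mathbf{v} = (v_1, v_2)$, we notice that

$$D_{\mathbf{v}}f(3,2) = \langle\nabla f(3,2), \mathbf{v}\rangle.$$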
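The formula $12v_1 + 9v_2$ can be compared with a difference quotient for a few sample vectors. This is a sketch with illustrative names; it uses a symmetric quotient for accuracy, which agrees with the one-variable limit in the definition whenever that limit exists.

```python
def f(x, y):
    # Function of Example 4.17: f(x, y) = x^2 * y
    return x * x * y

def directional_quotient(v1, v2, h=1e-6):
    """Symmetric difference quotient of f along v = (v1, v2) at the point (3, 2)."""
    return (f(3 + h * v1, 2 + h * v2) - f(3 - h * v1, 2 - h * v2)) / (2 * h)

def formula(v1, v2):
    """Value 12*v1 + 9*v2 obtained in the worked solution."""
    return 12 * v1 + 9 * v2
```

For instance, `directional_quotient(-1.0, 2.0)` should be close to `formula(-1.0, 2.0)`.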


Example 4.18 


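Consider the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined as

$$f(x,y) = \begin{cases} \dfrac{xy}{x^2+y^2}, & \text{if } (x,y) \neq (0,0), \\ 0, & \text{if } (x,y) = (0,0), \end{cases}$$

in Example 4.6. Find all the nonzero vectors $\mathbf{v}$ for which $D_{\mathbf{v}}f(0,0)$ exists.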


Solution 


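Given a nonzero vector $\mathbf{v} = (v_1, v_2)$, $v_1^2 + v_2^2 \neq 0$. By definition,

$$D_{\mathbf{v}}f(0,0) = \lim_{h\to 0}\frac{f(hv_1, hv_2) - f(0,0)}{h} = \lim_{h\to 0}\frac{1}{h}\cdot\frac{h^2v_1v_2}{h^2(v_1^2+v_2^2)} = \lim_{h\to 0}\frac{v_1v_2}{h(v_1^2+v_2^2)}.$$

This limit exists if and only if $v_1v_2 = 0$, which is the case if $v_1 = 0$ or $v_2 = 0$.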
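The dichotomy can be seen by evaluating the difference quotient directly, assuming the function is $f(x,y) = xy/(x^2+y^2)$ away from the origin. The name `quotient` and the chosen step sizes are illustrative.

```python
def f(x, y):
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)

def quotient(v1, v2, h):
    """Difference quotient (f(h*v) - f(0)) / h at the origin."""
    return (f(h * v1, h * v2) - f(0.0, 0.0)) / h
```

Along the coordinate directions the quotient is identically 0, while for $\mathbf{v} = (1,1)$ it equals $1/(2h)$ and grows without bound as $h$ shrinks.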




Figure 4.8: The function f(x, y) in Example 4.18. 


Example 4.19 


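Let $f : \mathbb{R}^2 \to \mathbb{R}$ be the function defined as

$$f(x,y) = \begin{cases} \dfrac{y}{|x|}\sqrt{x^2+y^2}, & \text{if } x \neq 0, \\ 0, & \text{if } x = 0. \end{cases}$$

Find all the nonzero vectors $\mathbf{v}$ for which $D_{\mathbf{v}}f(0,0)$ exists.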


Figure 4.9: The function f(x, y) in Example 4.19. 




Solution 
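Given a nonzero vector $\mathbf{v} = (v_1, v_2)$, we consider two cases.
Case 1: $v_1 = 0$.
Then $\mathbf{v} = (0, v_2)$. In this case,

$$D_{\mathbf{v}}f(0,0) = \lim_{h\to 0}\frac{f(0, hv_2) - f(0,0)}{h} = \lim_{h\to 0}\frac{0}{h} = 0.$$

Case 2: $v_1 \neq 0$.

$$D_{\mathbf{v}}f(0,0) = \lim_{h\to 0}\frac{f(hv_1, hv_2) - f(0,0)}{h} = \lim_{h\to 0}\frac{1}{h}\cdot\frac{hv_2}{|hv_1|}\sqrt{h^2(v_1^2+v_2^2)} = \frac{v_2\sqrt{v_1^2+v_2^2}}{|v_1|}.$$

We conclude that $D_{\mathbf{v}}f(0,0)$ exists for all nonzero vectors $\mathbf{v}$.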


Remark 4.12 


For the function considered in Example 4.19, by taking $\mathbf{v}$ to be $(1,0)$ and $(0,1)$ respectively, we find that $f_x(0,0) = 0$ and $f_y(0,0) = 0$. Notice that

$$\lim_{\mathbf{h}\to\mathbf{0}}\frac{f(\mathbf{h}) - f(\mathbf{0}) - \langle\nabla f(\mathbf{0}), \mathbf{h}\rangle}{\|\mathbf{h}\|} = \lim_{\mathbf{h}\to\mathbf{0}}\frac{h_2}{|h_1|}.$$

This limit does not exist. By Corollary 4.7, $f$ is not differentiable at $(0,0)$. This gives an example of a function which is not differentiable at $(0,0)$ but has directional derivatives at $(0,0)$ in all directions. In fact, one can show that $f$ is not continuous at $(0,0)$.


The following theorem says that differentiability of a function implies existence 
of directional derivatives. 




Theorem 4.13 


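Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $\mathbf{F} : O \to \mathbb{R}^m$ be a function defined on $O$. If $\mathbf{F}$ is differentiable at $\mathbf{x}_0$, then for any nonzero vector $\mathbf{v}$, $D_{\mathbf{v}}\mathbf{F}(\mathbf{x}_0)$ exists and

$$D_{\mathbf{v}}\mathbf{F}(\mathbf{x}_0) = \mathbf{DF}(\mathbf{x}_0)\mathbf{v} = \begin{bmatrix} \langle\nabla F_1(\mathbf{x}_0), \mathbf{v}\rangle \\ \langle\nabla F_2(\mathbf{x}_0), \mathbf{v}\rangle \\ \vdots \\ \langle\nabla F_m(\mathbf{x}_0), \mathbf{v}\rangle \end{bmatrix}.$$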


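Again, it is sufficient to consider a function $f : O \to \mathbb{R}$ with codomain $\mathbb{R}$. By definition, $D_{\mathbf{v}}f(\mathbf{x}_0)$ is given by the limit

$$\lim_{h\to 0}\frac{f(\mathbf{x}_0+h\mathbf{v}) - f(\mathbf{x}_0)}{h}$$

if it exists. Since $f$ is differentiable at $\mathbf{x}_0$, it has partial derivatives at $\mathbf{x}_0$ and

$$\lim_{\mathbf{h}\to\mathbf{0}}\frac{f(\mathbf{x}_0+\mathbf{h}) - f(\mathbf{x}_0) - \langle\nabla f(\mathbf{x}_0), \mathbf{h}\rangle}{\|\mathbf{h}\|} = 0.$$

As $h \to 0$, $h\mathbf{v} \to \mathbf{0}$. By the limit law for composite functions, we find that

$$\lim_{h\to 0}\frac{f(\mathbf{x}_0+h\mathbf{v}) - f(\mathbf{x}_0) - \langle\nabla f(\mathbf{x}_0), h\mathbf{v}\rangle}{|h|\,\|\mathbf{v}\|} = 0.$$

This implies that

$$\lim_{h\to 0}\frac{f(\mathbf{x}_0+h\mathbf{v}) - f(\mathbf{x}_0) - h\langle\nabla f(\mathbf{x}_0), \mathbf{v}\rangle}{h} = 0.$$

Thus, $D_{\mathbf{v}}f(\mathbf{x}_0)$ exists and

$$D_{\mathbf{v}}f(\mathbf{x}_0) = \lim_{h\to 0}\frac{f(\mathbf{x}_0+h\mathbf{v}) - f(\mathbf{x}_0)}{h} = \langle\nabla f(\mathbf{x}_0), \mathbf{v}\rangle.$$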


Example 4.20 


Consider the function $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2$ defined as $\mathbf{F}(x,y) = (x^2y, xy^2)$. Find $D_{\mathbf{v}}\mathbf{F}(2,3)$ when $\mathbf{v} = (-1,2)$.


Solution 
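Since $\mathbf{F}$ is a polynomial mapping, it is differentiable. The derivative matrix is

$$\mathbf{DF}(x,y) = \begin{bmatrix} 2xy & x^2 \\ y^2 & 2xy \end{bmatrix}.$$

Therefore,

$$D_{\mathbf{v}}\mathbf{F}(2,3) = \mathbf{DF}(2,3)\begin{bmatrix} -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 12 & 4 \\ 9 & 12 \end{bmatrix}\begin{bmatrix} -1 \\ 2 \end{bmatrix} = \begin{bmatrix} -4 \\ 15 \end{bmatrix}.$$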
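The matrix-vector product from Theorem 4.13 can be cross-checked against a difference quotient along $\mathbf{v}$. This is an illustrative sketch: `DF` hard-codes the derivative matrix of $\mathbf{F}(x,y) = (x^2y, xy^2)$, and `mat_vec` is a small helper introduced here.

```python
def F(x, y):
    # Mapping of Example 4.20
    return (x * x * y, x * y * y)

def DF(x, y):
    """Derivative matrix of F, with rows given by the gradients of the components."""
    return [[2 * x * y, x * x], [y * y, 2 * x * y]]

def mat_vec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

v = (-1.0, 2.0)
exact = mat_vec(DF(2.0, 3.0), v)

# Symmetric difference quotient of F along v at the point (2, 3)
h = 1e-6
plus = F(2.0 + h * v[0], 3.0 + h * v[1])
minus = F(2.0 - h * v[0], 3.0 - h * v[1])
approx = [(p - m) / (2 * h) for p, m in zip(plus, minus)]
```

Both `exact` and `approx` should describe the same vector, the first exactly and the second up to the discretization error.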


Theorem 4.13 can be used to determine the direction in which a differentiable function increases fastest at a point.


Corollary 4.14 


Let $O$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $f : O \to \mathbb{R}$ be a function defined on $O$. If $f$ is differentiable at $\mathbf{x}_0$ and $\nabla f(\mathbf{x}_0) \neq \mathbf{0}$, then at the point $\mathbf{x}_0$, the function $f$ increases fastest in the direction of $\nabla f(\mathbf{x}_0)$.


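Let $\mathbf{u}$ be a unit vector. Then the rate of change of the function $f$ at the point $\mathbf{x}_0$ in the direction of $\mathbf{u}$ is given by

$$D_{\mathbf{u}}f(\mathbf{x}_0) = \langle\nabla f(\mathbf{x}_0), \mathbf{u}\rangle.$$

By the Cauchy-Schwarz inequality,

$$\langle\nabla f(\mathbf{x}_0), \mathbf{u}\rangle \leq \|\nabla f(\mathbf{x}_0)\|\,\|\mathbf{u}\| = \|\nabla f(\mathbf{x}_0)\|,$$

and the equality holds if and only if $\mathbf{u}$ has the same direction as $\nabla f(\mathbf{x}_0)$.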
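The Cauchy-Schwarz argument can be illustrated by brute force in the plane: sample many unit vectors and look for the one with the largest directional derivative. The sketch below uses the gradient $(-2,-6)$ of $f(x,y) = x^2+4xy+5y^2$ at $(1,-1)$ from Example 4.16; the grid of 3600 angles is an arbitrary choice.

```python
import math

grad = (-2.0, -6.0)  # gradient of f(x, y) = x^2 + 4xy + 5y^2 at (1, -1)

def rate(theta):
    """Directional derivative <grad, u> for the unit vector u = (cos theta, sin theta)."""
    return grad[0] * math.cos(theta) + grad[1] * math.sin(theta)

# Search a uniform grid of directions for the largest rate of change
angles = [2 * math.pi * k / 3600 for k in range(3600)]
best = max(angles, key=rate)
best_u = (math.cos(best), math.sin(best))

norm = math.hypot(*grad)
grad_u = (grad[0] / norm, grad[1] / norm)
```

Up to the resolution of the grid, `best_u` coincides with the normalized gradient `grad_u`, and `rate(best)` is close to `norm` without exceeding it.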




Exercises 4.2 


Question 1 


Let f : R®? > R be the function defined as 


es Y, z) a ee 


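Find a vector $\mathbf{c}$ in $\mathbb{R}^3$ and a constant $b$ such that

$$\lim_{\mathbf{h}\to\mathbf{0}}\frac{f(\mathbf{x}_0+\mathbf{h}) - \langle\mathbf{c}, \mathbf{h}\rangle - b}{\|\mathbf{h}\|} = 0,$$

where $\mathbf{x}_0 = (3,2,-1)$.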


Question 2 


Let $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^3$ be the function defined as


F(z, y) = (a? + 4y*, Tay, 20 + y/). 


Find a polynomial mapping G : R? — R? of degree at most one which is a 
first order approximation of F : R? > R° at the point (1, —1). 


Question 3 


Let xp = (1, 2,0, —1), and let F : R* — R? be the function defined as 


Bi@i,.2o, 23,24) — (cane, E30, + Xa, Sa -+ 20, + 1) ; 


Find a 3 x 4 matrix A and a vector b in R® such that 


F(x) — Ax —b 
(x) -Ax—b_ 




Question 4 


Let f : R? > R be the function defined as 


i sin(2? +y)+ bay”. 


Find Dy f(1, —1) for any nonzero vector v = (vj, v2). 


Question 5 


Let f : R? > R be the function defined as 


PPA 
f(a,y) = ae if (x,y) # (0,0) 


0, it (ey (ON): 


Show that f : R? — R is continuously differentiable. 


Question 6 


Find the equation of the tangent plane to the graph of the function f : R? > 
R, f(x,y) = 4a? + 3xy — y? at the point where (x, y) = (2, —1). 


Question 7 


Let f : R? > R be the function defined as 


2 


f(z,y) = ze if (x,y) # (0,0), 


0, ih) —(O): 


(a) Show that f : R? — R is continuous. 


(b) Show that f : R? — R has partial derivatives. 


(c) Show that f : R? \ {(0,0)} > R is differentiable. 


(d) Show that f : R? — R is not differentiable at (0,0). 


(e) Find all the nonzero vectors v = (v1, v2) for which Dy f(0, 0) exists. 




Question 8 


IR be the function defined as 
2 2 
INSEE ba 2) 
y 


0, ify = 0. 


f(z,y) = 


(a) Show that f : R? > R is not continuous at (0, 0). 


(b) Show that Dy f(0,0) exists for all nonzero vectors v. 


Question 9 


Let f : R? > R be the function defined as 


1 


AGO (2° + y*) sin — a 


, if (x,y) # (0,0), 


0, ii (Oe 


(a) Show that $f : \mathbb{R}^2 \to \mathbb{R}$ is differentiable at $(0,0)$.


(b) Show that $f : \mathbb{R}^2 \to \mathbb{R}$ is not continuously differentiable at $(0,0)$.




4.3. The Chain Rule and the Mean Value Theorem 


In volume I, we have seen that the chain rule plays an important role in calculating the derivative of a composite function. Given that $f : (a,b) \to \mathbb{R}$ and $g : (c,d) \to \mathbb{R}$ are functions such that $f((a,b)) \subset (c,d)$, the chain rule says that if $f$ is differentiable at $x_0$, $g$ is differentiable at $y_0 = f(x_0)$, then the composite function $(g\circ f) : (a,b) \to \mathbb{R}$ is differentiable at $x_0$, and

$$(g\circ f)'(x_0) = g'(f(x_0))f'(x_0).$$


For multivariable functions, the chain rule takes the following form. 


Theorem 4.15 The Chain Rule 


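Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$, and let $\mathcal{U}$ be an open subset of $\mathbb{R}^k$. Assume
that $\mathbf{F} : \mathcal{O} \to \mathbb{R}^k$ and $\mathbf{G} : \mathcal{U} \to \mathbb{R}^m$ are functions such that $\mathbf{F}(\mathcal{O}) \subset \mathcal{U}$.
If $\mathbf{F}$ is differentiable at $\mathbf{x}_0$ and $\mathbf{G}$ is differentiable at $\mathbf{y}_0 = \mathbf{F}(\mathbf{x}_0)$, then the
composite function $\mathbf{H} = (\mathbf{G} \circ \mathbf{F}) : \mathcal{O} \to \mathbb{R}^m$ is differentiable at $\mathbf{x}_0$, and

\[ D\mathbf{H}(\mathbf{x}_0) = D(\mathbf{G} \circ \mathbf{F})(\mathbf{x}_0) = D\mathbf{G}(\mathbf{F}(\mathbf{x}_0))\, D\mathbf{F}(\mathbf{x}_0). \]


Notice that on the right-hand side, $D\mathbf{G}(\mathbf{F}(\mathbf{x}_0))$ is an $m \times k$ matrix and $D\mathbf{F}(\mathbf{x}_0)$ is
a $k \times n$ matrix. Hence, the product $D\mathbf{G}(\mathbf{F}(\mathbf{x}_0))\, D\mathbf{F}(\mathbf{x}_0)$ makes sense, and it is
an $m \times n$ matrix, which is the correct size for the derivative matrix $D\mathbf{H}(\mathbf{x}_0)$.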


Let us spell this out more explicitly. Assume that

\begin{align*}
\mathbf{F}(x_1, \ldots, x_n) &= (F_1(x_1, \ldots, x_n), \ldots, F_k(x_1, \ldots, x_n)),\\
\mathbf{G}(y_1, \ldots, y_k) &= (G_1(y_1, \ldots, y_k), \ldots, G_m(y_1, \ldots, y_k)),\\
\mathbf{H}(x_1, \ldots, x_n) &= (H_1(x_1, \ldots, x_n), \ldots, H_m(x_1, \ldots, x_n)).
\end{align*}

Then for $1 \le i \le m$,

\[ H_i(x_1, \ldots, x_n) = G_i\bigl(F_1(x_1, \ldots, x_n), \ldots, F_k(x_1, \ldots, x_n)\bigr). \]




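For $1 \le l \le k$, let

\[ y_l = F_l(x_1, \ldots, x_n). \]

The chain rule says that if $1 \le j \le n$,

\begin{align*}
\frac{\partial H_i}{\partial x_j}(x_1, \ldots, x_n)
&= \sum_{l=1}^{k} \frac{\partial G_i}{\partial y_l}(y_1, \ldots, y_k)\, \frac{\partial F_l}{\partial x_j}(x_1, \ldots, x_n)\\
&= \frac{\partial G_i}{\partial y_1}(y_1, \ldots, y_k)\, \frac{\partial F_1}{\partial x_j}(x_1, \ldots, x_n)\\
&\quad + \frac{\partial G_i}{\partial y_2}(y_1, \ldots, y_k)\, \frac{\partial F_2}{\partial x_j}(x_1, \ldots, x_n)\\
&\quad + \cdots + \frac{\partial G_i}{\partial y_k}(y_1, \ldots, y_k)\, \frac{\partial F_k}{\partial x_j}(x_1, \ldots, x_n).
\end{align*}

Namely, to differentiate $H_i = G_i \circ \mathbf{F}$ with respect to $x_j$, we differentiate $G_i$ with
respect to each of the variables $y_1, \ldots, y_k$, multiply each by the partial derivative
of the corresponding component $F_1, \ldots, F_k$ with respect to $x_j$, then take the sum.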

Let us illustrate this with a simple example. 


Example 4.21 


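Consider the function $h : \mathbb{R}^2 \to \mathbb{R}$ defined as

\[ h(x, y) = \sin(2x + 3y) + e^{xy}. \]

It is straightforward to find that

\[ \frac{\partial h}{\partial x} = 2\cos(2x + 3y) + y e^{xy}, \qquad \frac{\partial h}{\partial y} = 3\cos(2x + 3y) + x e^{xy}. \]

Notice that we can write $h = g \circ \mathbf{F}$, where $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2$ is the function

\[ \mathbf{F}(x, y) = (2x + 3y, xy), \]

and $g : \mathbb{R}^2 \to \mathbb{R}$ is the function

\[ g(u, v) = \sin u + e^{v}. \]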




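Obviously, $\mathbf{F}$ and $g$ are continuously differentiable functions, with

\[ D\mathbf{F}(x, y) = \begin{bmatrix} 2 & 3 \\ y & x \end{bmatrix}, \qquad Dg(u, v) = \begin{bmatrix} \cos u & e^{v} \end{bmatrix}. \]

Taking $u = 2x + 3y$ and $v = xy$, we find that

\begin{align*}
Dg(u, v)\, D\mathbf{F}(x, y) &= \begin{bmatrix} \cos(2x + 3y) & e^{xy} \end{bmatrix} \begin{bmatrix} 2 & 3 \\ y & x \end{bmatrix}\\
&= \begin{bmatrix} 2\cos(2x + 3y) + y e^{xy} & 3\cos(2x + 3y) + x e^{xy} \end{bmatrix}\\
&= Dh(x, y).
\end{align*}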
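The identity $Dh = Dg(\mathbf{F})\, D\mathbf{F}$ in Example 4.21 can also be checked numerically. The following Python sketch is only an illustration: the function names (`chain_rule_gradient`, `numeric_gradient`) and the finite-difference step are assumptions chosen here, not part of the text. It multiplies the two derivative matrices and compares the result with a central-difference approximation of the partial derivatives of $h$.

```python
import math


def F(x, y):
    # Inner mapping F(x, y) = (2x + 3y, xy) from Example 4.21.
    return (2 * x + 3 * y, x * y)


def DF(x, y):
    # Derivative matrix of F; each row is the gradient of one component.
    return [[2.0, 3.0], [y, x]]


def Dg(u, v):
    # Derivative (row vector) of g(u, v) = sin u + e^v.
    return [math.cos(u), math.exp(v)]


def chain_rule_gradient(x, y):
    # Row vector Dg(F(x, y)) * DF(x, y), as given by the chain rule.
    u, v = F(x, y)
    dg = Dg(u, v)
    df = DF(x, y)
    return [dg[0] * df[0][j] + dg[1] * df[1][j] for j in range(2)]


def h(x, y):
    # The composite function h = g o F written out directly.
    return math.sin(2 * x + 3 * y) + math.exp(x * y)


def numeric_gradient(x, y, step=1e-6):
    # Central differences, used only as an independent check.
    hx = (h(x + step, y) - h(x - step, y)) / (2 * step)
    hy = (h(x, y + step) - h(x, y - step)) / (2 * step)
    return [hx, hy]
```

At any sample point, for instance $(0.3, -0.7)$, the two lists returned by `chain_rule_gradient` and `numeric_gradient` should agree up to the finite-difference error.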


Now let us prove the chain rule. 


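Proof of the Chain Rule
Since $\mathbf{F}$ is differentiable at $\mathbf{x}_0$ and $\mathbf{G}$ is differentiable at $\mathbf{y}_0 = \mathbf{F}(\mathbf{x}_0)$,
$D\mathbf{F}(\mathbf{x}_0)$ and $D\mathbf{G}(\mathbf{y}_0)$ exist. There exist positive numbers $r_1$ and $r_2$ such
that $B(\mathbf{x}_0, r_1) \subset \mathcal{O}$ and $B(\mathbf{y}_0, r_2) \subset \mathcal{U}$. Let

\[ \boldsymbol{\varepsilon}_1(\mathbf{h}) = \frac{\mathbf{F}(\mathbf{x}_0 + \mathbf{h}) - \mathbf{F}(\mathbf{x}_0) - D\mathbf{F}(\mathbf{x}_0)\mathbf{h}}{\|\mathbf{h}\|}, \qquad \mathbf{h} \in B(\mathbf{0}, r_1) \setminus \{\mathbf{0}\}, \]

\[ \boldsymbol{\varepsilon}_2(\mathbf{v}) = \frac{\mathbf{G}(\mathbf{y}_0 + \mathbf{v}) - \mathbf{G}(\mathbf{y}_0) - D\mathbf{G}(\mathbf{y}_0)\mathbf{v}}{\|\mathbf{v}\|}, \qquad \mathbf{v} \in B(\mathbf{0}, r_2) \setminus \{\mathbf{0}\}, \]

and set $\boldsymbol{\varepsilon}_1(\mathbf{0}) = \mathbf{0}$, $\boldsymbol{\varepsilon}_2(\mathbf{0}) = \mathbf{0}$.
Since $\mathbf{F}$ is differentiable at $\mathbf{x}_0$ and $\mathbf{G}$ is differentiable at $\mathbf{y}_0$,

\[ \lim_{\mathbf{h} \to \mathbf{0}} \boldsymbol{\varepsilon}_1(\mathbf{h}) = \mathbf{0}, \qquad \lim_{\mathbf{v} \to \mathbf{0}} \boldsymbol{\varepsilon}_2(\mathbf{v}) = \mathbf{0}. \]

There exist positive constants $c_1$ and $c_2$ such that

\[ \|D\mathbf{F}(\mathbf{x}_0)\mathbf{h}\| \le c_1 \|\mathbf{h}\| \quad \text{for all } \mathbf{h} \in \mathbb{R}^n, \]

\[ \|D\mathbf{G}(\mathbf{y}_0)\mathbf{v}\| \le c_2 \|\mathbf{v}\| \quad \text{for all } \mathbf{v} \in \mathbb{R}^k. \]

Now since $\mathbf{F}$ is differentiable at $\mathbf{x}_0$, it is continuous at $\mathbf{x}_0$. Hence, there
exists a positive number $r$ such that $r \le r_1$ and $\mathbf{F}(B(\mathbf{x}_0, r)) \subset B(\mathbf{y}_0, r_2)$.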




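For $\mathbf{h} \in B(\mathbf{0}, r)$, let

\[ \mathbf{v} = \mathbf{F}(\mathbf{x}_0 + \mathbf{h}) - \mathbf{F}(\mathbf{x}_0). \]

Then $\mathbf{v} \in B(\mathbf{0}, r_2)$ and

\[ \mathbf{v} = D\mathbf{F}(\mathbf{x}_0)\mathbf{h} + \|\mathbf{h}\|\, \boldsymbol{\varepsilon}_1(\mathbf{h}). \]

It follows that

\[ \|\mathbf{v}\| \le \|D\mathbf{F}(\mathbf{x}_0)\mathbf{h}\| + \|\mathbf{h}\|\, \|\boldsymbol{\varepsilon}_1(\mathbf{h})\| \le \|\mathbf{h}\| \bigl(c_1 + \|\boldsymbol{\varepsilon}_1(\mathbf{h})\|\bigr). \]

In particular, we find that when $\mathbf{h} \to \mathbf{0}$, $\mathbf{v} \to \mathbf{0}$. Now,

\begin{align*}
\mathbf{H}(\mathbf{x}_0 + \mathbf{h}) - \mathbf{H}(\mathbf{x}_0)
&= \mathbf{G}(\mathbf{F}(\mathbf{x}_0 + \mathbf{h})) - \mathbf{G}(\mathbf{F}(\mathbf{x}_0))\\
&= \mathbf{G}(\mathbf{y}_0 + \mathbf{v}) - \mathbf{G}(\mathbf{y}_0)\\
&= D\mathbf{G}(\mathbf{y}_0)\mathbf{v} + \|\mathbf{v}\|\, \boldsymbol{\varepsilon}_2(\mathbf{v})\\
&= D\mathbf{G}(\mathbf{y}_0) D\mathbf{F}(\mathbf{x}_0)\mathbf{h} + \|\mathbf{h}\|\, D\mathbf{G}(\mathbf{y}_0)\boldsymbol{\varepsilon}_1(\mathbf{h}) + \|\mathbf{v}\|\, \boldsymbol{\varepsilon}_2(\mathbf{v}).
\end{align*}

Therefore, for $\mathbf{h} \in B(\mathbf{0}, r) \setminus \{\mathbf{0}\}$,

\[ \frac{\mathbf{H}(\mathbf{x}_0 + \mathbf{h}) - \mathbf{H}(\mathbf{x}_0) - D\mathbf{G}(\mathbf{y}_0) D\mathbf{F}(\mathbf{x}_0)\mathbf{h}}{\|\mathbf{h}\|} = D\mathbf{G}(\mathbf{y}_0)\boldsymbol{\varepsilon}_1(\mathbf{h}) + \frac{\|\mathbf{v}\|}{\|\mathbf{h}\|}\, \boldsymbol{\varepsilon}_2(\mathbf{v}). \]

This implies that

\begin{align*}
\left\| \frac{\mathbf{H}(\mathbf{x}_0 + \mathbf{h}) - \mathbf{H}(\mathbf{x}_0) - D\mathbf{G}(\mathbf{y}_0) D\mathbf{F}(\mathbf{x}_0)\mathbf{h}}{\|\mathbf{h}\|} \right\|
&\le \|D\mathbf{G}(\mathbf{y}_0)\boldsymbol{\varepsilon}_1(\mathbf{h})\| + \frac{\|\mathbf{v}\|}{\|\mathbf{h}\|}\, \|\boldsymbol{\varepsilon}_2(\mathbf{v})\|\\
&\le c_2 \|\boldsymbol{\varepsilon}_1(\mathbf{h})\| + \bigl(c_1 + \|\boldsymbol{\varepsilon}_1(\mathbf{h})\|\bigr) \|\boldsymbol{\varepsilon}_2(\mathbf{v})\|.
\end{align*}

Since $\mathbf{v} \to \mathbf{0}$ when $\mathbf{h} \to \mathbf{0}$, we find that $\boldsymbol{\varepsilon}_2(\mathbf{v}) \to \mathbf{0}$ when $\mathbf{h} \to \mathbf{0}$. Thus,
we find that

\[ \lim_{\mathbf{h} \to \mathbf{0}} \frac{\mathbf{H}(\mathbf{x}_0 + \mathbf{h}) - \mathbf{H}(\mathbf{x}_0) - D\mathbf{G}(\mathbf{y}_0) D\mathbf{F}(\mathbf{x}_0)\mathbf{h}}{\|\mathbf{h}\|} = \mathbf{0}. \]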




This shows that $\mathbf{H}$ is differentiable at $\mathbf{x}_0$ and

\[ D\mathbf{H}(\mathbf{x}_0) = D\mathbf{G}(\mathbf{y}_0)\, D\mathbf{F}(\mathbf{x}_0). \]


Example 4.22 


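Let $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^2$ be the function defined as

\[ \mathbf{F}(x, y, z) = (x^2 + 4y^2 + 9z^2, xyz). \]

Find a vector $\mathbf{b}$ in $\mathbb{R}^2$ and a $2 \times 3$ matrix $A$ such that

\[ \lim_{(u,v,w) \to (1,-1,0)} \frac{\mathbf{F}(2u + v, v + w, u + w) - \mathbf{b} - A\mathbf{p}}{\sqrt{(u - 1)^2 + (v + 1)^2 + w^2}} = \mathbf{0}, \qquad \text{where } \mathbf{p} = \begin{bmatrix} u \\ v \\ w \end{bmatrix}. \]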


Solution 
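Let $\mathbf{p}_0 = (1, -1, 0)$, and let $\mathbf{G} : \mathbb{R}^3 \to \mathbb{R}^3$ be the mapping

\[ \mathbf{G}(u, v, w) = (2u + v, v + w, u + w). \]

Then $\mathbf{H}(\mathbf{p}) = \mathbf{H}(u, v, w) = \mathbf{F}(2u + v, v + w, u + w) = (\mathbf{F} \circ \mathbf{G})(u, v, w)$.
Notice that $\mathbf{F}$ and $\mathbf{G}$ are polynomial mappings. Hence, they are infinitely
differentiable. To have

\[ \lim_{\mathbf{p} \to \mathbf{p}_0} \frac{\mathbf{H}(\mathbf{p}) - \mathbf{b} - A\mathbf{p}}{\|\mathbf{p} - \mathbf{p}_0\|} = \lim_{(u,v,w) \to (1,-1,0)} \frac{\mathbf{F}(2u + v, v + w, u + w) - \mathbf{b} - A\mathbf{p}}{\sqrt{(u - 1)^2 + (v + 1)^2 + w^2}} = \mathbf{0}, \]

the first order approximation theorem says that

\[ \mathbf{b} + A\mathbf{p} = \mathbf{H}(\mathbf{p}_0) + D\mathbf{H}(\mathbf{p}_0)(\mathbf{p} - \mathbf{p}_0). \]

Therefore,

\[ A = D\mathbf{H}(\mathbf{p}_0) \quad \text{and} \quad \mathbf{b} = \mathbf{H}(\mathbf{p}_0) - A\mathbf{p}_0. \]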




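Since $\mathbf{G}(\mathbf{p}_0) = (1, -1, 1)$,

\[ \mathbf{H}(\mathbf{p}_0) = \mathbf{H}(1, -1, 0) = \mathbf{F}(1, -1, 1) = (14, -1), \]

\[ D\mathbf{G}(1, -1, 0) = \begin{bmatrix} 2 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix}, \qquad D\mathbf{F}(x, y, z) = \begin{bmatrix} 2x & 8y & 18z \\ yz & xz & xy \end{bmatrix}. \]

By chain rule,

\[ A = D\mathbf{H}(\mathbf{p}_0) = D\mathbf{F}(1, -1, 1)\, D\mathbf{G}(1, -1, 0) = \begin{bmatrix} 2 & -8 & 18 \\ -1 & 1 & -1 \end{bmatrix} \begin{bmatrix} 2 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 22 & -6 & 10 \\ -3 & 0 & 0 \end{bmatrix}, \]

and

\[ \mathbf{b} = \mathbf{H}(\mathbf{p}_0) - A\mathbf{p}_0 = \begin{bmatrix} 14 \\ -1 \end{bmatrix} - \begin{bmatrix} 28 \\ -3 \end{bmatrix} = \begin{bmatrix} -14 \\ 2 \end{bmatrix}. \]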
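The recipe $A = D\mathbf{H}(\mathbf{p}_0) = D\mathbf{F}(\mathbf{G}(\mathbf{p}_0))\, D\mathbf{G}(\mathbf{p}_0)$ and $\mathbf{b} = \mathbf{H}(\mathbf{p}_0) - A\mathbf{p}_0$ from the solution of Example 4.22 can be carried out mechanically. The Python sketch below is an illustration only; the helper `matmul` and the variable names are assumptions introduced here, and the derivative matrices of $\mathbf{F}$ and $\mathbf{G}$ are entered by hand from their definitions.

```python
def matmul(A, B):
    # Plain matrix product of two lists of rows.
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def F(x, y, z):
    # F(x, y, z) = (x^2 + 4y^2 + 9z^2, xyz).
    return (x * x + 4 * y * y + 9 * z * z, x * y * z)


def DF(x, y, z):
    # Derivative matrix of F, computed by hand.
    return [[2 * x, 8 * y, 18 * z], [y * z, x * z, x * y]]


def G(u, v, w):
    # G(u, v, w) = (2u + v, v + w, u + w); G is linear, so DG is constant.
    return (2 * u + v, v + w, u + w)


DG = [[2, 1, 0], [0, 1, 1], [1, 0, 1]]

p0 = (1, -1, 0)
y0 = G(*p0)                      # the point G(p0) at which DF is evaluated
A = matmul(DF(*y0), DG)          # chain rule: DH(p0) = DF(G(p0)) DG(p0)
H0 = F(*y0)                      # H(p0) = F(G(p0))
b = [H0[i] - sum(A[i][j] * p0[j] for j in range(3)) for i in range(2)]
```

With integer inputs all arithmetic here is exact, so `A` and `b` can be compared directly with a hand computation.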


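Example 4.23


Let $\alpha$ be a positive number, and let $f : \mathbb{R}^n \to \mathbb{R}$ be the function defined as

\[ f(\mathbf{x}) = \|\mathbf{x}\|^{\alpha}. \]

Determine the values of $\alpha$ for which $f$ is differentiable.


Solution
Let $g : \mathbb{R}^n \to \mathbb{R}$ be the function

\[ g(\mathbf{x}) = \|\mathbf{x}\|^2 = x_1^2 + x_2^2 + \cdots + x_n^2. \]

Then $g(\mathbb{R}^n) = [0, \infty)$, and $g(\mathbf{x}) = 0$ if and only if $\mathbf{x} = \mathbf{0}$.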




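Since $g$ is a polynomial, it is infinitely differentiable. Let $h : [0, \infty) \to \mathbb{R}$
be the function $h(u) = u^{\alpha/2}$. Then $h$ is differentiable on $(0, \infty)$. Since
$f(\mathbf{x}) = (h \circ g)(\mathbf{x})$, the chain rule implies that for all $\mathbf{x}_0 \in \mathbb{R}^n \setminus \{\mathbf{0}\}$, $f$ is
differentiable at $\mathbf{x}_0$.


Now consider the point $\mathbf{x} = \mathbf{0}$. Notice that for $1 \le i \le n$, $f_{x_i}(\mathbf{0})$ exists
provided that the limit

\[ \lim_{t \to 0} \frac{f(t\mathbf{e}_i) - f(\mathbf{0})}{t} = \lim_{t \to 0} \frac{|t|^{\alpha}}{t} \]

exists. This is the case if and only if $\alpha > 1$. Therefore, $f$ is not differentiable at $\mathbf{x} = \mathbf{0}$
if $\alpha \le 1$. If $\alpha > 1$, we find that $f_{x_i}(\mathbf{0}) = 0$ for all $1 \le i \le n$. Hence,
$\nabla f(\mathbf{0}) = \mathbf{0}$. Since

\[ \lim_{\mathbf{h} \to \mathbf{0}} \frac{f(\mathbf{h}) - f(\mathbf{0}) - \langle \nabla f(\mathbf{0}), \mathbf{h} \rangle}{\|\mathbf{h}\|} = \lim_{\mathbf{h} \to \mathbf{0}} \|\mathbf{h}\|^{\alpha - 1} = 0, \]

we conclude that when $\alpha > 1$, $f$ is differentiable at $\mathbf{x} = \mathbf{0}$.
Therefore, $f$ is differentiable if and only if $\alpha > 1$.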


Example 4.24 


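Let $f : \mathbb{R}^2 \to \mathbb{R}$ be a twice continuously differentiable function, and let
$g : (0, \infty) \times \mathbb{R} \to \mathbb{R}$ be the function defined as

\[ g(r, \theta) = f(r\cos\theta, r\sin\theta). \]

Show that

\[ \frac{\partial^2 g}{\partial r^2} + \frac{1}{r} \frac{\partial g}{\partial r} + \frac{1}{r^2} \frac{\partial^2 g}{\partial \theta^2} = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}. \]


Solution


Let $\mathbf{H} : (0, \infty) \times \mathbb{R} \to \mathbb{R}^2$ be the mapping defined by

\[ \mathbf{H}(r, \theta) = (r\cos\theta, r\sin\theta). \]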




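Then $\mathbf{H}$ is infinitely differentiable, and $g = f \circ \mathbf{H}$. Let $x = H_1(r, \theta) =
r\cos\theta$ and $y = H_2(r, \theta) = r\sin\theta$. By chain rule,

\begin{align*}
\frac{\partial g}{\partial r} &= \frac{\partial f}{\partial x} \frac{\partial x}{\partial r} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial r} = \cos\theta\, \frac{\partial f}{\partial x} + \sin\theta\, \frac{\partial f}{\partial y},\\
\frac{\partial g}{\partial \theta} &= \frac{\partial f}{\partial x} \frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial \theta} = -r\sin\theta\, \frac{\partial f}{\partial x} + r\cos\theta\, \frac{\partial f}{\partial y}.
\end{align*}

Using product rule and chain rule, we then have

\[ \frac{\partial^2 g}{\partial r^2} = \cos\theta \left( \frac{\partial^2 f}{\partial x^2} \frac{\partial x}{\partial r} + \frac{\partial^2 f}{\partial y \partial x} \frac{\partial y}{\partial r} \right) + \sin\theta \left( \frac{\partial^2 f}{\partial x \partial y} \frac{\partial x}{\partial r} + \frac{\partial^2 f}{\partial y^2} \frac{\partial y}{\partial r} \right). \]

Since $f$ has continuous second order partial derivatives, $f_{xy} = f_{yx}$.
Therefore,

\[ \frac{\partial^2 g}{\partial r^2} = \cos^2\theta\, \frac{\partial^2 f}{\partial x^2} + 2\sin\theta\cos\theta\, \frac{\partial^2 f}{\partial x \partial y} + \sin^2\theta\, \frac{\partial^2 f}{\partial y^2}. \]

Similarly, we have

\begin{align*}
\frac{\partial^2 g}{\partial \theta^2} &= -r\sin\theta \left( \frac{\partial^2 f}{\partial x^2} \frac{\partial x}{\partial \theta} + \frac{\partial^2 f}{\partial y \partial x} \frac{\partial y}{\partial \theta} \right) + r\cos\theta \left( \frac{\partial^2 f}{\partial x \partial y} \frac{\partial x}{\partial \theta} + \frac{\partial^2 f}{\partial y^2} \frac{\partial y}{\partial \theta} \right)\\
&\quad - r\cos\theta\, \frac{\partial f}{\partial x} - r\sin\theta\, \frac{\partial f}{\partial y}\\
&= r^2\sin^2\theta\, \frac{\partial^2 f}{\partial x^2} - 2r^2\sin\theta\cos\theta\, \frac{\partial^2 f}{\partial x \partial y} + r^2\cos^2\theta\, \frac{\partial^2 f}{\partial y^2} - r\, \frac{\partial g}{\partial r}.
\end{align*}

From these, we obtain

\[ \frac{\partial^2 g}{\partial r^2} + \frac{1}{r} \frac{\partial g}{\partial r} + \frac{1}{r^2} \frac{\partial^2 g}{\partial \theta^2} = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}. \]


Example 4.24 gives the Laplacian

\[ \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \]

of $f$ in polar coordinates. It is customary that one would abuse notation and write
$g = f$, so that the formula takes the form

\[ \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = \frac{\partial^2 f}{\partial r^2} + \frac{1}{r} \frac{\partial f}{\partial r} + \frac{1}{r^2} \frac{\partial^2 f}{\partial \theta^2}. \]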
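The polar form of the Laplacian can be sanity-checked numerically on a concrete function. The Python sketch below is an illustration under assumptions made here: the sample function $f(x, y) = x^3 y + y^2$ (whose Laplacian is $6xy + 2$), the names `g` and `polar_laplacian`, and the finite-difference step are all hypothetical choices, not taken from the text.

```python
import math


def f(x, y):
    # Sample twice continuously differentiable function (an assumption).
    return x ** 3 * y + y ** 2


def laplacian_f(x, y):
    # f_xx + f_yy for the sample function, computed by hand.
    return 6 * x * y + 2


def g(r, theta):
    # g(r, theta) = f(r cos(theta), r sin(theta)).
    return f(r * math.cos(theta), r * math.sin(theta))


def polar_laplacian(r, theta, step=1e-4):
    # g_rr + g_r / r + g_thetatheta / r^2, via central differences.
    g_rr = (g(r + step, theta) - 2 * g(r, theta) + g(r - step, theta)) / step ** 2
    g_r = (g(r + step, theta) - g(r - step, theta)) / (2 * step)
    g_tt = (g(r, theta + step) - 2 * g(r, theta) + g(r, theta - step)) / step ** 2
    return g_rr + g_r / r + g_tt / r ** 2
```

For $r > 0$, `polar_laplacian(r, theta)` should agree with `laplacian_f(r*cos(theta), r*sin(theta))` up to the discretization error of the difference quotients.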




Remark 4.13 


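We can use the chain rule to prove Theorem 4.13. Given that $\mathcal{O}$ is an open
subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and $\mathbf{F} : \mathcal{O} \to \mathbb{R}^m$ is a function
that is differentiable at $\mathbf{x}_0$, we want to show that $D_{\mathbf{v}}\mathbf{F}(\mathbf{x}_0)$ exists for any
nonzero vector $\mathbf{v}$, and

\[ D_{\mathbf{v}}\mathbf{F}(\mathbf{x}_0) = D\mathbf{F}(\mathbf{x}_0)\mathbf{v}. \]

Since $\mathcal{O}$ is an open set that contains the point $\mathbf{x}_0$, there is an $r > 0$ such that
$B(\mathbf{x}_0, r) \subset \mathcal{O}$. By definition,

\[ D_{\mathbf{v}}\mathbf{F}(\mathbf{x}_0) = \lim_{h \to 0} \frac{\mathbf{F}(\mathbf{x}_0 + h\mathbf{v}) - \mathbf{F}(\mathbf{x}_0)}{h} = \mathbf{g}'(0), \]

where $\mathbf{g} : (-r, r) \to \mathbb{R}^m$ is the function $\mathbf{g}(h) = \mathbf{F}(\mathbf{x}_0 + h\mathbf{v})$. Let
$\boldsymbol{\gamma} : (-r, r) \to \mathbb{R}^n$ be the function defined as $\boldsymbol{\gamma}(h) = \mathbf{x}_0 + h\mathbf{v}$. Then $\boldsymbol{\gamma}$ is a
differentiable function with $\boldsymbol{\gamma}'(h) = \mathbf{v}$. Since $\mathbf{g} = \mathbf{F} \circ \boldsymbol{\gamma}$ and $\boldsymbol{\gamma}(0) = \mathbf{x}_0$,
the chain rule implies that $\mathbf{g}$ is differentiable at $h = 0$ and

\[ \mathbf{g}'(0) = D\mathbf{F}(\mathbf{x}_0)\boldsymbol{\gamma}'(0) = D\mathbf{F}(\mathbf{x}_0)\mathbf{v}. \]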


This completes the proof. 


Definition 4.16 Tangent Line to a Curve 


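A curve in $\mathbb{R}^n$ is a continuous function $\boldsymbol{\gamma} : [a, b] \to \mathbb{R}^n$. Let $c_0$ be a point in
$(a, b)$. If the curve $\boldsymbol{\gamma}$ is differentiable at $c_0$, the tangent vector to the curve
$\boldsymbol{\gamma}$ at the point $\boldsymbol{\gamma}(c_0)$ is the vector $\boldsymbol{\gamma}'(c_0)$ in $\mathbb{R}^n$, while the tangent line to the
curve $\boldsymbol{\gamma}$ at the point $\boldsymbol{\gamma}(c_0)$ is the line in $\mathbb{R}^n$ given by $\mathbf{x} : \mathbb{R} \to \mathbb{R}^n$,

\[ \mathbf{x}(t) = \boldsymbol{\gamma}(c_0) + t\boldsymbol{\gamma}'(c_0). \]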


Remark 4.14 Tangent Lines and Tangent Planes 


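Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $f : \mathcal{O} \to \mathbb{R}$
be a function that is differentiable at $\mathbf{x}_0$. We have seen that the tangent plane
to the graph of $f$ at the point $(\mathbf{x}_0, f(\mathbf{x}_0))$ has equation

\[ x_{n+1} = f(\mathbf{x}_0) + \langle \nabla f(\mathbf{x}_0), \mathbf{x} - \mathbf{x}_0 \rangle. \]

Now assume that $r > 0$ and $\boldsymbol{\gamma} : (-r, r) \to \mathbb{R}^{n+1}$ is a differentiable curve
in $\mathbb{R}^{n+1}$ that lies on the graph of $f$, and $\boldsymbol{\gamma}(0) = (\mathbf{x}_0, f(\mathbf{x}_0))$. For all $t \in (-r, r)$,

\[ \gamma_{n+1}(t) = f(\gamma_1(t), \ldots, \gamma_n(t)). \]

By chain rule, we find that

\[ \gamma_{n+1}'(0) = \langle \nabla f(\mathbf{x}_0), \mathbf{v} \rangle, \qquad \text{where } \mathbf{v} = (\gamma_1'(0), \ldots, \gamma_n'(0)). \]

The vector $\mathbf{w} = (\mathbf{v}, \gamma_{n+1}'(0))$ is the tangent vector to the curve $\boldsymbol{\gamma}$ at the
point $(\mathbf{x}_0, f(\mathbf{x}_0))$. The equation of the tangent line is

\[ (x_1(t), \ldots, x_n(t), x_{n+1}(t)) = (\mathbf{x}_0, f(\mathbf{x}_0)) + t\,(\gamma_1'(0), \ldots, \gamma_n'(0), \gamma_{n+1}'(0)). \]

Thus, we find that

\[ (x_1(t), \ldots, x_n(t)) = \mathbf{x}(t) = \mathbf{x}_0 + t\mathbf{v}, \]

and

\[ x_{n+1}(t) = f(\mathbf{x}_0) + t\gamma_{n+1}'(0). \]

These imply that

\begin{align*}
x_{n+1}(t) &= f(\mathbf{x}_0) + t\langle \nabla f(\mathbf{x}_0), \mathbf{v} \rangle\\
&= f(\mathbf{x}_0) + \langle \nabla f(\mathbf{x}_0), \mathbf{x}(t) - \mathbf{x}_0 \rangle.
\end{align*}

Thus, the tangent line to the curve $\boldsymbol{\gamma}$ lies in the tangent plane.

In fact, the tangent plane to the graph of a function $f$ at a point can be
characterized as the unique plane that contains all the tangent lines to the
differentiable curves that lie on the graph and pass through that point.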


Now we turn to the mean value theorem. For a single variable function, the
mean value theorem says that given that $f : I \to \mathbb{R}$ is a differentiable function
defined on the open interval $I$, if $x_0$ and $x_0 + h$ are two points in $I$, there exists
$c \in (0, 1)$ such that

\[ f(x_0 + h) - f(x_0) = h f'(x_0 + ch). \]

Notice that the point $x_0 + ch$ is a point strictly in between $x_0$ and $x_0 + h$. To
generalize this theorem to multivariable functions, one natural question to ask is
the following. If $\mathbf{F} : \mathcal{O} \to \mathbb{R}^m$ is a differentiable function defined on the open
subset $\mathcal{O}$ of $\mathbb{R}^n$, and $\mathbf{x}_0$ and $\mathbf{x}_0 + \mathbf{h}$ are points in $\mathcal{O}$ such that the line segment between
them lies entirely in $\mathcal{O}$, does there exist a constant $c \in (0, 1)$ such that

\[ \mathbf{F}(\mathbf{x}_0 + \mathbf{h}) - \mathbf{F}(\mathbf{x}_0) = D\mathbf{F}(\mathbf{x}_0 + c\mathbf{h})\mathbf{h}? \]

When $m \ge 2$, the answer is no in general. Let us look at the following example.


Example 4.25 


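Consider the function $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2$ defined as

\[ \mathbf{F}(x, y) = (x^2 y, xy). \]

Show that there does not exist a constant $c \in (0, 1)$ such that

\[ \mathbf{F}(\mathbf{x}_0 + \mathbf{h}) - \mathbf{F}(\mathbf{x}_0) = D\mathbf{F}(\mathbf{x}_0 + c\mathbf{h})\mathbf{h} \]

when $\mathbf{x}_0 = (0, 0)$ and $\mathbf{h} = (1, 1)$.


Solution
Notice that

\[ D\mathbf{F}(x, y) = \begin{bmatrix} 2xy & x^2 \\ y & x \end{bmatrix}. \]

When $\mathbf{x}_0 = (0, 0)$ and $\mathbf{h} = (1, 1)$, $\mathbf{x}_0 + c\mathbf{h} = (c, c)$. If there exists a constant
$c \in (0, 1)$ such that

\[ \mathbf{F}(\mathbf{x}_0 + \mathbf{h}) - \mathbf{F}(\mathbf{x}_0) = D\mathbf{F}(\mathbf{x}_0 + c\mathbf{h})\mathbf{h}, \]

then

\[ \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2c^2 & c^2 \\ c & c \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3c^2 \\ 2c \end{bmatrix}. \]

This gives

\[ 3c^2 = 1 \quad \text{and} \quad 2c = 1. \]

But $2c = 1$ gives $c = 1/2$. When $c = 1/2$, $3c^2 = 3/4 \ne 1$. Hence, no such
$c$ can exist.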
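The failure of the vector-valued mean value identity in Example 4.25 can be seen numerically as well. In the Python sketch below, which is only an illustration, the name `mvt_residual` is a hypothetical helper: it returns $\mathbf{F}(\mathbf{x}_0 + \mathbf{h}) - \mathbf{F}(\mathbf{x}_0) - D\mathbf{F}(\mathbf{x}_0 + c\mathbf{h})\mathbf{h}$ for $\mathbf{x}_0 = (0,0)$ and $\mathbf{h} = (1,1)$, so a valid $c$ would have to make both components vanish simultaneously.

```python
def F(x, y):
    # F(x, y) = (x^2 y, xy) from Example 4.25.
    return (x * x * y, x * y)


def DF(x, y):
    # Derivative matrix of F.
    return [[2 * x * y, x * x], [y, x]]


def mvt_residual(c, x0=(0.0, 0.0), h=(1.0, 1.0)):
    # F(x0 + h) - F(x0) - DF(x0 + c h) h, component by component.
    p = (x0[0] + c * h[0], x0[1] + c * h[1])
    J = DF(*p)
    end = F(x0[0] + h[0], x0[1] + h[1])
    start = F(*x0)
    lhs = [end[i] - start[i] for i in range(2)]
    rhs = [J[i][0] * h[0] + J[i][1] * h[1] for i in range(2)]
    return [lhs[i] - rhs[i] for i in range(2)]


# Smallest value of max(|residual components|) over a grid of c in (0, 1).
worst_case = min(max(abs(t) for t in mvt_residual(k / 1000)) for k in range(1, 1000))
```

The second component of the residual vanishes only at $c = 1/2$, where the first component equals $1/4$; on the grid, `worst_case` stays bounded away from zero.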


However, when m = 1, we indeed have a mean value theorem. 


Theorem 4.16 The Mean Value Theorem 


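Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$, and let $\mathbf{x}_0$ and $\mathbf{x}_0 + \mathbf{h}$ be two points in $\mathcal{O}$
such that the line segment between them lies entirely in $\mathcal{O}$. If $f : \mathcal{O} \to \mathbb{R}$
is a differentiable function, there exists a constant $c \in (0, 1)$ such that

\[ f(\mathbf{x}_0 + \mathbf{h}) - f(\mathbf{x}_0) = \langle \nabla f(\mathbf{x}_0 + c\mathbf{h}), \mathbf{h} \rangle = \sum_{i=1}^{n} h_i \frac{\partial f}{\partial x_i}(\mathbf{x}_0 + c\mathbf{h}). \]


Define the function $\boldsymbol{\gamma} : [0, 1] \to \mathbb{R}^n$ by $\boldsymbol{\gamma}(t) = \mathbf{x}_0 + t\mathbf{h}$. Then $\boldsymbol{\gamma}$ is a
differentiable function with $\boldsymbol{\gamma}'(t) = \mathbf{h}$. Let $g = (f \circ \boldsymbol{\gamma}) : [0, 1] \to \mathbb{R}$.
Then

\[ g(t) = (f \circ \boldsymbol{\gamma})(t) = f(\mathbf{x}_0 + t\mathbf{h}). \]

Since $f$ and $\boldsymbol{\gamma}$ are differentiable, the chain rule implies that $g$ is also
differentiable and

\[ g'(t) = \langle \nabla f(\mathbf{x}_0 + t\mathbf{h}), \boldsymbol{\gamma}'(t) \rangle = \langle \nabla f(\mathbf{x}_0 + t\mathbf{h}), \mathbf{h} \rangle. \]

By the mean value theorem for single variable functions, we find that there
exists $c \in (0, 1)$ such that

\[ g(1) - g(0) = g'(c)(1 - 0). \]

In other words, there exists $c \in (0, 1)$ such that

\[ f(\mathbf{x}_0 + \mathbf{h}) - f(\mathbf{x}_0) = \langle \nabla f(\mathbf{x}_0 + c\mathbf{h}), \mathbf{h} \rangle. \]

This completes the proof.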
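To see the mean value theorem at work on a concrete scalar function, one can search for the constant $c$ numerically. The Python sketch below is an illustration under assumptions chosen here: the sample function $f(x, y) = x^3 + y^2$, the points $\mathbf{x}_0 = (0, 0)$, $\mathbf{h} = (1, 2)$, and the helper name `find_c` are hypothetical. It uses bisection, which works in this instance because $t \mapsto \langle \nabla f(\mathbf{x}_0 + t\mathbf{h}), \mathbf{h} \rangle$ is continuous and the quantity `phi` changes sign on $[0, 1]$; the theorem itself only guarantees existence of $c$.

```python
def f(x, y):
    # Sample differentiable function (an assumption for illustration).
    return x ** 3 + y ** 2


def grad_f(x, y):
    # Gradient of the sample function, computed by hand.
    return (3 * x * x, 2 * y)


def find_c(x0, h, tol=1e-12):
    # Solve <grad f(x0 + c h), h> = f(x0 + h) - f(x0) for c in (0, 1) by bisection.
    increment = f(x0[0] + h[0], x0[1] + h[1]) - f(*x0)

    def phi(c):
        gx, gy = grad_f(x0[0] + c * h[0], x0[1] + c * h[1])
        return gx * h[0] + gy * h[1] - increment

    lo, hi = 0.0, 1.0
    if not (phi(lo) < 0 < phi(hi)):
        raise ValueError("bisection needs a sign change of phi on [0, 1]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if phi(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


c = find_c((0.0, 0.0), (1.0, 2.0))
```

Here the equation for $c$ reduces to $3c^2 + 8c = 5$, whose root in $(0, 1)$ is $c = (-8 + \sqrt{124})/6$.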


As in the single variable case, the mean value theorem has the following 
application. 


Corollary 4.17 


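Let $\mathcal{O}$ be an open connected subset of $\mathbb{R}^n$, and let $f : \mathcal{O} \to \mathbb{R}$ be a function
defined on $\mathcal{O}$. If $f$ is differentiable and $\nabla f(\mathbf{x}) = \mathbf{0}$ for all $\mathbf{x} \in \mathcal{O}$, then $f$ is
a constant function.


If $\mathbf{u}$ and $\mathbf{v}$ are two points in $\mathcal{O}$ such that the line segment between them lies
entirely in $\mathcal{O}$, then the mean value theorem implies that $f(\mathbf{u}) = f(\mathbf{v})$.


Since $\mathcal{O}$ is an open connected subset of $\mathbb{R}^n$, Theorem 3.16 says that any
two points $\mathbf{u}$ and $\mathbf{v}$ in $\mathcal{O}$ can be joined by a polygonal path in $\mathcal{O}$. In other
words, there are points $\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_k$ in $\mathcal{O}$ such that $\mathbf{x}_0 = \mathbf{u}$, $\mathbf{x}_k = \mathbf{v}$, and
for $1 \le i \le k$, the line segment between $\mathbf{x}_{i-1}$ and $\mathbf{x}_i$ lies entirely in $\mathcal{O}$.
Therefore,

\[ f(\mathbf{x}_{i-1}) = f(\mathbf{x}_i) \quad \text{for all } 1 \le i \le k. \]

This proves that $f(\mathbf{u}) = f(\mathbf{v})$. Hence, $f$ is a constant function.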




Exercises 4.3 


Question 1 


Let $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^3$ be the function defined as


F(z,y) = (2? +y*,y,2+y). 
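Find a vector $\mathbf{b}$ in $\mathbb{R}^3$ and a $3 \times 2$ matrix $A$ such that

\[ \lim_{(u,v) \to (1,-1)} \frac{\mathbf{F}(5u + 3v, u - 2v) - \mathbf{b} - A\mathbf{w}}{\sqrt{(u - 1)^2 + (v + 1)^2}} = \mathbf{0}, \qquad \text{where } \mathbf{w} = \begin{bmatrix} u \\ v \end{bmatrix}. \]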


Question 2 


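Let $\phi : \mathbb{R} \to \mathbb{R}$ and $\psi : \mathbb{R} \to \mathbb{R}$ be functions that have continuous second
order derivatives, and let $c$ be a constant. Define the function $f : \mathbb{R}^2 \to \mathbb{R}$
by

\[ f(t, x) = \phi(x + ct) + \psi(x - ct). \]

Show that

\[ \frac{\partial^2 f}{\partial t^2} - c^2 \frac{\partial^2 f}{\partial x^2} = 0. \]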

Question 3 


Let $\alpha$ be a constant, and let $f : \mathbb{R}^n \setminus \{\mathbf{0}\} \to \mathbb{R}$ be the function defined by
$f(\mathbf{x}) = \|\mathbf{x}\|^{\alpha}$.


Find the value(s) of a such that 




Question 4 


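Let $f : \mathbb{R}^2 \to \mathbb{R}$ be a function such that $f(0, 0) = 2$ and

\[ \frac{\partial f}{\partial x}(x, y) = 11 \quad \text{and} \quad \frac{\partial f}{\partial y}(x, y) = -7 \quad \text{for all } (x, y) \in \mathbb{R}^2. \]

Show that

\[ f(x, y) = 2 + 11x - 7y \quad \text{for all } (x, y) \in \mathbb{R}^2. \]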


Question 5 


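Let $\mathcal{O}$ be an open subset of $\mathbb{R}^2$, and let $u : \mathcal{O} \to \mathbb{R}$ and $v : \mathcal{O} \to \mathbb{R}$ be twice
continuously differentiable functions. Define the function $\mathbf{F} : \mathcal{O} \to \mathbb{R}^2$ by

\[ \mathbf{F}(x, y) = (u(x, y), v(x, y)). \]

Let $\mathcal{U}$ be an open subset of $\mathbb{R}^2$ that contains $\mathbf{F}(\mathcal{O})$, and let $f : \mathcal{U} \to \mathbb{R}$ be a
twice continuously differentiable function. Define the function $g : \mathcal{O} \to \mathbb{R}$
by

\[ g(x, y) = (f \circ \mathbf{F})(x, y) = f(u(x, y), v(x, y)). \]

Find $g_{xx}$, $g_{xy}$ and $g_{yy}$ in terms of the first and second order partial derivatives
of $u$, $v$ and $f$.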




4.4 Second Order Approximations 


In this section, we turn to consider second order approximations. We only consider
a function $f : \mathcal{O} \to \mathbb{R}$ defined on an open subset $\mathcal{O}$ of $\mathbb{R}^n$ and whose codomain
is $\mathbb{R}$. The function is said to be twice differentiable if it has first order partial
derivatives, and each $f_{x_i} : \mathcal{O} \to \mathbb{R}$, $1 \le i \le n$, is a differentiable function. Notice
that a twice differentiable function has continuous first order partial derivatives.
Hence, it is differentiable. The differentiability of each $f_{x_i}$, $1 \le i \le n$, also implies
that $f$ has second order partial derivatives.


Lemma 4.18 


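Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$, and let $f : \mathcal{O} \to \mathbb{R}$ be a twice differentiable
function defined on $\mathcal{O}$. If $\mathbf{x}_0$ and $\mathbf{x}_0 + \mathbf{h}$ are two points in $\mathcal{O}$ such that the
line segment between them lies entirely in $\mathcal{O}$, then there is a $c \in (0, 1)$ such
that

\[ f(\mathbf{x}_0 + \mathbf{h}) - f(\mathbf{x}_0) - \langle \nabla f(\mathbf{x}_0), \mathbf{h} \rangle = \frac{1}{2} \mathbf{h}^T H_f(\mathbf{x}_0 + c\mathbf{h}) \mathbf{h} = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} h_i h_j \frac{\partial^2 f}{\partial x_j \partial x_i}(\mathbf{x}_0 + c\mathbf{h}). \]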


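Given $\mathbf{x}_0 \in \mathcal{O}$, let $r$ be a positive number such that $B(\mathbf{x}_0, r) \subset \mathcal{O}$. Define
the function $g : (-r, r) \to \mathbb{R}$ by

\[ g(t) = f(\mathbf{x}_0 + t\mathbf{h}). \]

Since $f : \mathcal{O} \to \mathbb{R}$ is differentiable, the chain rule implies that $g : (-r, r) \to \mathbb{R}$
is differentiable and

\[ g'(t) = \sum_{i=1}^{n} h_i \frac{\partial f}{\partial x_i}(\mathbf{x}_0 + t\mathbf{h}) = \langle \nabla f(\mathbf{x}_0 + t\mathbf{h}), \mathbf{h} \rangle. \]

Since each $f_{x_i} : \mathcal{O} \to \mathbb{R}$, $1 \le i \le n$, is differentiable, the chain rule again
implies that $g'$ is differentiable and

\[ g''(t) = \sum_{i=1}^{n} \sum_{j=1}^{n} h_i h_j \frac{\partial^2 f}{\partial x_j \partial x_i}(\mathbf{x}_0 + t\mathbf{h}) = \mathbf{h}^T H_f(\mathbf{x}_0 + t\mathbf{h}) \mathbf{h}. \]

By Lagrange's remainder theorem, there is a $c \in (0, 1)$ such that

\[ g(1) - g(0) - g'(0)(1 - 0) = \frac{g''(c)}{2}(1 - 0)^2. \]

This gives

\[ f(\mathbf{x}_0 + \mathbf{h}) - f(\mathbf{x}_0) - \langle \nabla f(\mathbf{x}_0), \mathbf{h} \rangle = \frac{1}{2} \mathbf{h}^T H_f(\mathbf{x}_0 + c\mathbf{h}) \mathbf{h}. \]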


If a function has continuous second order partial derivatives, then it is twice 
differentiable, and Clairaut’s theorem implies that its Hessian matrix is symmetric. 


For such a function, we can prove the second order approximation theorem. 


Theorem 4.19 Second Order Approximation Theorem 


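Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $f : \mathcal{O} \to \mathbb{R}$
be a twice continuously differentiable function defined on $\mathcal{O}$. We have the
following.

(a)
\[ \lim_{\mathbf{h} \to \mathbf{0}} \frac{f(\mathbf{x}_0 + \mathbf{h}) - f(\mathbf{x}_0) - \langle \nabla f(\mathbf{x}_0), \mathbf{h} \rangle - \frac{1}{2} \mathbf{h}^T H_f(\mathbf{x}_0) \mathbf{h}}{\|\mathbf{h}\|^2} = 0. \]

(b) If $Q(\mathbf{x})$ is a polynomial of degree at most two such that

\[ \lim_{\mathbf{h} \to \mathbf{0}} \frac{f(\mathbf{x}_0 + \mathbf{h}) - Q(\mathbf{x}_0 + \mathbf{h})}{\|\mathbf{h}\|^2} = 0, \]

then

\[ Q(\mathbf{x}) = f(\mathbf{x}_0) + \langle \nabla f(\mathbf{x}_0), \mathbf{x} - \mathbf{x}_0 \rangle + \frac{1}{2} (\mathbf{x} - \mathbf{x}_0)^T H_f(\mathbf{x}_0) (\mathbf{x} - \mathbf{x}_0). \tag{4.8} \]


Combining (a) and (b), the second order approximation theorem says that for
a twice continuously differentiable function, there exists a unique polynomial of
degree at most 2 which is a second order approximation of the function.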




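Let us prove part (a) first. Since $\mathcal{O}$ is open, there is an $r > 0$ such that
$B(\mathbf{x}_0, r) \subset \mathcal{O}$. For each $\mathbf{h}$ in $\mathbb{R}^n$ with $\|\mathbf{h}\| < r$, Lemma 4.18 says that there
is a $c_{\mathbf{h}} \in (0, 1)$ such that

\[ f(\mathbf{x}_0 + \mathbf{h}) - f(\mathbf{x}_0) - \langle \nabla f(\mathbf{x}_0), \mathbf{h} \rangle = \frac{1}{2} \mathbf{h}^T H_f(\mathbf{x}_0 + c_{\mathbf{h}}\mathbf{h}) \mathbf{h}. \]

Therefore, if $0 < \|\mathbf{h}\| < r$,

\begin{align*}
&\left| \frac{f(\mathbf{x}_0 + \mathbf{h}) - f(\mathbf{x}_0) - \langle \nabla f(\mathbf{x}_0), \mathbf{h} \rangle - \frac{1}{2} \mathbf{h}^T H_f(\mathbf{x}_0) \mathbf{h}}{\|\mathbf{h}\|^2} \right|\\
&\quad = \frac{1}{2} \left| \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{h_i h_j}{\|\mathbf{h}\|^2} \left( \frac{\partial^2 f}{\partial x_j \partial x_i}(\mathbf{x}_0 + c_{\mathbf{h}}\mathbf{h}) - \frac{\partial^2 f}{\partial x_j \partial x_i}(\mathbf{x}_0) \right) \right|\\
&\quad \le \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \left| \frac{\partial^2 f}{\partial x_j \partial x_i}(\mathbf{x}_0 + c_{\mathbf{h}}\mathbf{h}) - \frac{\partial^2 f}{\partial x_j \partial x_i}(\mathbf{x}_0) \right|,
\end{align*}

since $|h_i h_j| \le \|\mathbf{h}\|^2$. Since $c_{\mathbf{h}} \in (0, 1)$, $\lim_{\mathbf{h} \to \mathbf{0}} (\mathbf{x}_0 + c_{\mathbf{h}}\mathbf{h}) = \mathbf{x}_0$. For all $1 \le i \le n$, $1 \le j \le n$,
$f_{x_j x_i}$ is continuous. Hence,

\[ \lim_{\mathbf{h} \to \mathbf{0}} \frac{\partial^2 f}{\partial x_j \partial x_i}(\mathbf{x}_0 + c_{\mathbf{h}}\mathbf{h}) = \frac{\partial^2 f}{\partial x_j \partial x_i}(\mathbf{x}_0). \]

This proves that

\[ \lim_{\mathbf{h} \to \mathbf{0}} \frac{f(\mathbf{x}_0 + \mathbf{h}) - f(\mathbf{x}_0) - \langle \nabla f(\mathbf{x}_0), \mathbf{h} \rangle - \frac{1}{2} \mathbf{h}^T H_f(\mathbf{x}_0) \mathbf{h}}{\|\mathbf{h}\|^2} = 0. \]

To prove part (b), let

\[ P(\mathbf{x}) = f(\mathbf{x}_0) + \langle \nabla f(\mathbf{x}_0), \mathbf{x} - \mathbf{x}_0 \rangle + \frac{1}{2} (\mathbf{x} - \mathbf{x}_0)^T H_f(\mathbf{x}_0) (\mathbf{x} - \mathbf{x}_0). \]

Part (a) says that

\[ \lim_{\mathbf{h} \to \mathbf{0}} \frac{f(\mathbf{x}_0 + \mathbf{h}) - P(\mathbf{x}_0 + \mathbf{h})}{\|\mathbf{h}\|^2} = 0. \tag{4.9} \]

Since $Q(\mathbf{x})$ is a polynomial of degree at most two in $\mathbf{x}$, $Q(\mathbf{x}_0 + \mathbf{h})$ is a
polynomial of degree at most two in $\mathbf{h}$. Therefore, we can write $Q(\mathbf{x}_0 + \mathbf{h})$
as

\[ Q(\mathbf{x}_0 + \mathbf{h}) = c + \sum_{i=1}^{n} b_i h_i + \frac{1}{2} \sum_{i=1}^{n} a_{ii} h_i^2 + \sum_{1 \le i < j \le n} a_{ij} h_i h_j. \]

Since

\[ \lim_{\mathbf{h} \to \mathbf{0}} \frac{f(\mathbf{x}_0 + \mathbf{h}) - Q(\mathbf{x}_0 + \mathbf{h})}{\|\mathbf{h}\|^2} = 0, \]

subtracting (4.9) gives

\[ \lim_{\mathbf{h} \to \mathbf{0}} \frac{P(\mathbf{x}_0 + \mathbf{h}) - Q(\mathbf{x}_0 + \mathbf{h})}{\|\mathbf{h}\|^2} = 0. \tag{4.10} \]

It follows that

\[ \lim_{\mathbf{h} \to \mathbf{0}} \bigl( P(\mathbf{x}_0 + \mathbf{h}) - Q(\mathbf{x}_0 + \mathbf{h}) \bigr) = 0, \tag{4.11} \]

and

\[ \lim_{\mathbf{h} \to \mathbf{0}} \frac{P(\mathbf{x}_0 + \mathbf{h}) - Q(\mathbf{x}_0 + \mathbf{h})}{\|\mathbf{h}\|} = 0. \tag{4.12} \]

Since $f$ has continuous second order partial derivatives, $f_{x_i x_j}(\mathbf{x}_0) =
f_{x_j x_i}(\mathbf{x}_0)$. Thus,

\begin{align*}
P(\mathbf{x}_0 + \mathbf{h}) - Q(\mathbf{x}_0 + \mathbf{h}) &= \bigl( f(\mathbf{x}_0) - c \bigr) + \sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i}(\mathbf{x}_0) - b_i \right) h_i + \frac{1}{2} \sum_{i=1}^{n} \left( \frac{\partial^2 f}{\partial x_i^2}(\mathbf{x}_0) - a_{ii} \right) h_i^2\\
&\quad + \sum_{1 \le i < j \le n} \left( \frac{\partial^2 f}{\partial x_i \partial x_j}(\mathbf{x}_0) - a_{ij} \right) h_i h_j.
\end{align*}

Eq. (4.11) implies that $c = f(\mathbf{x}_0)$. Then eq. (4.12) implies that

\[ b_i = \frac{\partial f}{\partial x_i}(\mathbf{x}_0) \quad \text{for all } 1 \le i \le n. \]

Finally, (4.10) implies that for any $1 \le i \le j \le n$,

\[ a_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}(\mathbf{x}_0). \]

This completes the proof that $Q(\mathbf{x}) = P(\mathbf{x})$.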




Example 4.26 


Find a polynomial $Q(x, y)$ of degree at most 2 such that

\[ \lim_{(x,y) \to (1,2)} \frac{\sin(4x^2 - y^2) - Q(x, y)}{(x - 1)^2 + (y - 2)^2} = 0. \]


Solution
Since $g(x, y) = 4x^2 - y^2$ is a polynomial function, it is infinitely
differentiable. Since the sine function is also infinitely differentiable, the
function $f(x, y) = \sin(4x^2 - y^2)$ is infinitely differentiable.

\begin{align*}
f_x(x, y) &= 8x\cos(4x^2 - y^2), \qquad f_y(x, y) = -2y\cos(4x^2 - y^2),\\
f_{xx}(x, y) &= 8\cos(4x^2 - y^2) - 64x^2\sin(4x^2 - y^2),\\
f_{xy}(x, y) &= f_{yx}(x, y) = 16xy\sin(4x^2 - y^2),\\
f_{yy}(x, y) &= -2\cos(4x^2 - y^2) - 4y^2\sin(4x^2 - y^2).
\end{align*}

Hence,

\begin{align*}
f(1, 2) &= 0, & f_x(1, 2) &= 8, & f_y(1, 2) &= -4,\\
f_{xx}(1, 2) &= 8, & f_{xy}(1, 2) &= 0, & f_{yy}(1, 2) &= -2.
\end{align*}

By the second order approximation theorem,

\begin{align*}
Q(x, y) &= f(1, 2) + f_x(1, 2)(x - 1) + f_y(1, 2)(y - 2) + \frac{1}{2} f_{xx}(1, 2)(x - 1)^2\\
&\quad + f_{xy}(1, 2)(x - 1)(y - 2) + \frac{1}{2} f_{yy}(1, 2)(y - 2)^2\\
&= 8(x - 1) - 4(y - 2) + 4(x - 1)^2 - (y - 2)^2.
\end{align*}
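The defining property of the second order approximation in Example 4.26 can be observed numerically: the quotient $(f - Q)/\|\mathbf{h}\|^2$ should shrink as $(x, y)$ approaches $(1, 2)$. The Python sketch below is an illustration only; the helper name `ratio` and the sample offsets are assumptions made here.

```python
import math


def f(x, y):
    # f(x, y) = sin(4x^2 - y^2) from Example 4.26.
    return math.sin(4 * x * x - y * y)


def Q(x, y):
    # Second order approximation of f at (1, 2).
    return 8 * (x - 1) - 4 * (y - 2) + 4 * (x - 1) ** 2 - (y - 2) ** 2


def ratio(dx, dy):
    # (f - Q) / ||h||^2 evaluated at (1 + dx, 2 + dy), for (dx, dy) != (0, 0).
    return (f(1 + dx, 2 + dy) - Q(1 + dx, 2 + dy)) / (dx * dx + dy * dy)


# The quotient along the diagonal direction for shrinking offsets.
samples = [abs(ratio(t, t)) for t in (1e-1, 1e-2, 1e-3)]
```

In this example $Q$ coincides with $4x^2 - y^2$ itself, so $f - Q = \sin s - s$ with $s = 4x^2 - y^2$, which is of third order in the offset; the entries of `samples` therefore decrease roughly linearly with the offset.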


Example 4.27 


Determine whether the limit

\[ \lim_{(x,y) \to (0,0)} \frac{e^{x+y} - 1 - x - y}{x^2 + y^2} \]

exists. If yes, find the limit.




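Solution
Since the exponential function and the function $g(x, y) = x + y$ are infinitely
differentiable, the function $f(x, y) = e^{x+y}$ is infinitely differentiable. By
the second order approximation theorem,

\[ \lim_{(x,y) \to (0,0)} \frac{f(x, y) - Q(x, y)}{x^2 + y^2} = 0, \]

where

\begin{align*}
Q(x, y) &= f(0, 0) + x \frac{\partial f}{\partial x}(0, 0) + y \frac{\partial f}{\partial y}(0, 0)\\
&\quad + \frac{1}{2} x^2 \frac{\partial^2 f}{\partial x^2}(0, 0) + xy \frac{\partial^2 f}{\partial x \partial y}(0, 0) + \frac{1}{2} y^2 \frac{\partial^2 f}{\partial y^2}(0, 0).
\end{align*}

Now,

\[ \frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} = \frac{\partial^2 f}{\partial x^2} = \frac{\partial^2 f}{\partial x \partial y} = \frac{\partial^2 f}{\partial y^2} = e^{x+y}. \]

Thus,

\[ f(0, 0) = \frac{\partial f}{\partial x}(0, 0) = \frac{\partial f}{\partial y}(0, 0) = \frac{\partial^2 f}{\partial x^2}(0, 0) = \frac{\partial^2 f}{\partial x \partial y}(0, 0) = \frac{\partial^2 f}{\partial y^2}(0, 0) = 1. \]

It follows that

\[ Q(x, y) = 1 + x + y + \frac{1}{2} x^2 + xy + \frac{1}{2} y^2. \]

Hence,

\[ \lim_{(x,y) \to (0,0)} \frac{e^{x+y} - 1 - x - y - \frac{1}{2} x^2 - xy - \frac{1}{2} y^2}{x^2 + y^2} = 0. \tag{4.13} \]

If

\[ \lim_{(x,y) \to (0,0)} \frac{e^{x+y} - 1 - x - y}{x^2 + y^2} = a \]

exists, subtracting (4.13) shows that

\[ \lim_{(x,y) \to (0,0)} h(x, y) = a, \qquad \text{where } h(x, y) = \frac{\frac{1}{2} x^2 + xy + \frac{1}{2} y^2}{x^2 + y^2}. \]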




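This implies that if $\{\mathbf{w}_k\}$ is a sequence in $\mathbb{R}^2 \setminus \{\mathbf{0}\}$ that converges to $(0, 0)$,
then the sequence $\{h(\mathbf{w}_k)\}$ converges to $a$. For $k \in \mathbb{Z}^+$, let

\[ \mathbf{u}_k = \left( \frac{1}{k}, 0 \right), \qquad \mathbf{v}_k = \left( \frac{1}{k}, \frac{1}{k} \right). \]

Then $\{\mathbf{u}_k\}$ and $\{\mathbf{v}_k\}$ are sequences in $\mathbb{R}^2 \setminus \{\mathbf{0}\}$ that converge to $(0, 0)$.
Hence, the sequences $\{h(\mathbf{u}_k)\}$ and $\{h(\mathbf{v}_k)\}$ both converge to $a$. Since

\[ h(\mathbf{u}_k) = \frac{1}{2} \quad \text{and} \quad h(\mathbf{v}_k) = 1 \quad \text{for all } k \in \mathbb{Z}^+, \]

the sequence $\{h(\mathbf{u}_k)\}$ converges to $\frac{1}{2}$ while the sequence $\{h(\mathbf{v}_k)\}$
converges to $1$. This gives a contradiction. Hence, the limit

\[ \lim_{(x,y) \to (0,0)} \frac{e^{x+y} - 1 - x - y}{x^2 + y^2} \]

does not exist.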
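The two-sequence argument of Example 4.27 can be reproduced numerically. The Python sketch below is an illustration under assumptions made here: the helper names `h` and `quotient` are hypothetical, and `math.expm1` is used for $e^{s} - 1$ to limit cancellation error near the origin. It evaluates both the reduced function $h$ and the original quotient along the directions $(t, 0)$ and $(t, t)$.

```python
import math


def h(x, y):
    # h(x, y) = (x^2/2 + xy + y^2/2) / (x^2 + y^2), for (x, y) != (0, 0).
    return (0.5 * x * x + x * y + 0.5 * y * y) / (x * x + y * y)


def quotient(x, y):
    # (e^{x+y} - 1 - x - y) / (x^2 + y^2), using expm1 for accuracy near 0.
    s = x + y
    return (math.expm1(s) - s) / (x * x + y * y)


# Values along the x-axis and along the diagonal, approaching the origin.
along_axis = [quotient(1.0 / k, 0.0) for k in (10, 100, 1000)]
along_diagonal = [quotient(1.0 / k, 1.0 / k) for k in (10, 100, 1000)]
```

Along the axis the values approach $1/2$, while along the diagonal they approach $1$, in line with the conclusion that the limit does not exist.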




Exercises 4.4 


Question 1 


Let f : R? > R be the function 
f(x,y) = ay + day’. 


Find a polynomial $Q(x, y)$ of degree at most 2 such that

\[ \lim_{(x,y) \to (1,-1)} \frac{f(x, y) - Q(x, y)}{(x - 1)^2 + (y + 1)^2} = 0. \]


Question 2 


Tee RRI nee Tettier nen reitm ee en ee 
(x,y)—(0,0) x2 aa Ue 


exists. If yes, 


find the limit. 


Question 3 


—1 
Determine whether the limit i exists. If yes, find 


the limit. 




4.5 Local Extrema 


In this section, we use differential calculus to study local extrema of a function 
f : O — R that is defined on an open subset O of R”. The definition of local 
extrema that we give here is only restricted to such functions. 


Definition 4.17 Local Maximum and Local Minimum 


Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $f : \mathcal{O} \to \mathbb{R}$
be a function defined on $\mathcal{O}$.


1. The point $\mathbf{x}_0$ is called a local maximizer of $f$ provided that there is a
$\delta > 0$ such that $B(\mathbf{x}_0, \delta) \subset \mathcal{O}$ and for all $\mathbf{x} \in B(\mathbf{x}_0, \delta)$,

\[ f(\mathbf{x}) \le f(\mathbf{x}_0). \]

The value $f(\mathbf{x}_0)$ is called a local maximum value of $f$.


2. The point $\mathbf{x}_0$ is called a local minimizer of $f$ provided that there is a
$\delta > 0$ such that $B(\mathbf{x}_0, \delta) \subset \mathcal{O}$ and for all $\mathbf{x} \in B(\mathbf{x}_0, \delta)$,

\[ f(\mathbf{x}) \ge f(\mathbf{x}_0). \]

The value $f(\mathbf{x}_0)$ is called a local minimum value of $f$.


3. The point $\mathbf{x}_0$ is called a local extremizer if it is either a local maximizer
or a local minimizer. The value $f(\mathbf{x}_0)$ is called a local extreme value if
it is either a local maximum value or a local minimum value.


From the definition, it is obvious that $\mathbf{x}_0$ is a local minimizer of the function
$f : \mathcal{O} \to \mathbb{R}$ if and only if it is a local maximizer of the function $-f : \mathcal{O} \to \mathbb{R}$.


Example 4.28 


(a) For the function $f : \mathbb{R}^2 \to \mathbb{R}$, $f(x, y) = x^2 + y^2$, $(0, 0)$ is a local
minimizer.


(b) For the function $g : \mathbb{R}^2 \to \mathbb{R}$, $g(x, y) = -x^2 - y^2$, $(0, 0)$ is a local
maximizer.




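(c) For the function $h : \mathbb{R}^2 \to \mathbb{R}$, $h(x, y) = x^2 - y^2$, $\mathbf{0} = (0, 0)$ is neither
a local maximizer nor a local minimizer. For any $\delta > 0$, let $r = \delta/2$.
The points $\mathbf{u} = (r, 0)$ and $\mathbf{v} = (0, r)$ are in $B(\mathbf{0}, \delta)$, but

\[ h(\mathbf{u}) = r^2 > 0 = h(\mathbf{0}), \qquad h(\mathbf{v}) = -r^2 < 0 = h(\mathbf{0}). \]


Figure 4.10: The functions $f(x, y)$, $g(x, y)$ and $h(x, y)$ defined in Example 4.28 (panels $z = f(x, y)$, $z = g(x, y)$, $z = h(x, y)$).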


The following theorem gives a necessary condition for a point to be a local
extremizer if the function has partial derivatives at that point.


Theorem 4.20 


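Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $f : \mathcal{O} \to \mathbb{R}$
be a function defined on $\mathcal{O}$. If $\mathbf{x}_0$ is a local extremizer and $f$ has partial
derivatives at $\mathbf{x}_0$, then the gradient of $f$ at $\mathbf{x}_0$ is the zero vector, namely,
$\nabla f(\mathbf{x}_0) = \mathbf{0}$.


Without loss of generality, assume that $\mathbf{x}_0$ is a local minimizer. Then there
is a $\delta > 0$ such that $B(\mathbf{x}_0, \delta) \subset \mathcal{O}$ and

\[ f(\mathbf{x}) \ge f(\mathbf{x}_0) \quad \text{for all } \mathbf{x} \in B(\mathbf{x}_0, \delta). \tag{4.14} \]

For $1 \le i \le n$, consider the function $g_i : (-\delta, \delta) \to \mathbb{R}$ defined by $g_i(t) =
f(\mathbf{x}_0 + t\mathbf{e}_i)$. By the definition of partial derivatives, $g_i$ is differentiable at
$t = 0$ and

\[ g_i'(0) = \frac{\partial f}{\partial x_i}(\mathbf{x}_0). \]

Eq. (4.14) implies that

\[ g_i(t) \ge g_i(0) \quad \text{for all } t \in (-\delta, \delta). \]

In other words, $t = 0$ is a local minimizer of the function $g_i : (-\delta, \delta) \to \mathbb{R}$.
From the theory of single variable analysis, we must have $g_i'(0) = 0$. Hence,
$f_{x_i}(\mathbf{x}_0) = 0$ for all $1 \le i \le n$. This proves that $\nabla f(\mathbf{x}_0) = \mathbf{0}$.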


Theorem 4.20 prompts us to make the following definition. 


Definition 4.18 Stationary Points 


Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $f : \mathcal{O} \to \mathbb{R}$
be a function defined on $\mathcal{O}$. If $f$ has partial derivatives at $\mathbf{x}_0$ and $\nabla f(\mathbf{x}_0) =
\mathbf{0}$, we call $\mathbf{x}_0$ a stationary point of $f$.


Theorem 4.20 says that if $f : \mathcal{O} \to \mathbb{R}$ has partial derivatives at $\mathbf{x}_0$, a necessary
condition for $\mathbf{x}_0$ to be a local extremizer is that it is a stationary point.


Example 4.29 


For all the three functions $f$, $g$ and $h$ defined in Example 4.28, the point
$\mathbf{0} = (0, 0)$ is a stationary point. However, $\mathbf{0}$ is a local minimizer of $f$, a local
maximizer of $g$, but neither a local maximizer nor a local minimizer of $h$.


The behavior of the function $h(x, y) = x^2 - y^2$ in Example 4.28 prompts us
to make the following definition.




Definition 4.19 Saddle Points 


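Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $f : \mathcal{O} \to \mathbb{R}$
be a function defined on $\mathcal{O}$. The point $\mathbf{x}_0$ is a saddle point of the function $f$
if it is a stationary point of $f$, but it is not a local extremizer. In other words,
$\nabla f(\mathbf{x}_0) = \mathbf{0}$, but for any $\delta > 0$, there exist $\mathbf{x}_1$ and $\mathbf{x}_2$ in $B(\mathbf{x}_0, \delta) \cap \mathcal{O}$ such
that

\[ f(\mathbf{x}_1) > f(\mathbf{x}_0) \quad \text{and} \quad f(\mathbf{x}_2) < f(\mathbf{x}_0). \]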


Example 4.30 


The point $(0, 0)$ is a saddle point of the function $h : \mathbb{R}^2 \to \mathbb{R}$, $h(x, y) = x^2 - y^2$.


By definition, if $\mathbf{x}_0$ is a stationary point of the function $f : \mathcal{O} \to \mathbb{R}$, then it is
either a local maximizer, a local minimizer, or a saddle point. If $f : \mathcal{O} \to \mathbb{R}$ has
continuous second order partial derivatives at $\mathbf{x}_0$, we can use the second derivative
test to partially determine whether $\mathbf{x}_0$ is a local maximizer, a local minimizer, or
a saddle point. When $n = 1$, we have seen that a stationary point $x_0$ of a function
$f$ is a local minimizer if $f''(x_0) > 0$. It is a local maximizer if $f''(x_0) < 0$. For
multivariable functions, it is natural to expect that whether $\mathbf{x}_0$ is a local extremizer
depends on the definiteness of the Hessian matrix $H_f(\mathbf{x}_0)$.

In Section 2.1, we have discussed the classification of a symmetric matrix. It
is either positive semi-definite, negative semi-definite or indefinite. Among the
positive semi-definite ones, there are those that are positive definite. Among the
negative semi-definite matrices, there are those which are negative definite.


Theorem 4.21 Second Derivative Test 


Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$, and let $f : \mathcal{O} \to \mathbb{R}$ be a twice continuously
differentiable function defined on $\mathcal{O}$. Assume that $\mathbf{x}_0$ is a stationary point
of $f : \mathcal{O} \to \mathbb{R}$.


(i) If $H_f(\mathbf{x}_0)$ is positive definite, then $\mathbf{x}_0$ is a local minimizer of $f$.


(ii) If $H_f(\mathbf{x}_0)$ is negative definite, then $\mathbf{x}_0$ is a local maximizer of $f$.


(iii) If $H_f(\mathbf{x}_0)$ is indefinite, then $\mathbf{x}_0$ is a saddle point.




The cases that are not covered in the second derivative test are the cases where
$H_f(\mathbf{x}_0)$ is positive semi-definite but not positive definite, or $H_f(\mathbf{x}_0)$ is negative
semi-definite but not negative definite. These are the inconclusive cases.


Proof of the Second Derivative Test 
Notice that (i) and (ii) are equivalent since xo is a local minimizer of f if 
and only if it is a local maximizer of —f, and H_; = —Hy. A symmetric 
matrix A is positive definite if and only if —A is negative definite. Thus, 
we only need to prove (i) and (iii). 
Since Xo is a stationary point, Vf (xo) = 0. It follows from the second 
order approximation theorem that 


jam £O%0+h) = (xo) = gh” Hy (xo) 


lim INE = i) (4.15) 


To prove (i), asume that H (xo) is positive definite. By Theorem 2.9, there 
is a positive number c such that 


h’ Hy (xo)h > e|lhl|? for all h € R”. 


Eq. 4.15 implies that there is a d > 0 such that B(x, 6) C O and for all h 
with 0 < ||h|| < 6, 


i hy) ae et 


3" 


Therefore, 


fa Ga = sh Hs) < sli? for all ||hl| < 6. 


This implies that for all h with ||h|| < 6, 


il Cc C 
f (Xo +h) — f(%o) > sh” Hy(xo)b — 5b)? > & Ilha? > 0. 


Thus, f(x) > f(xo) for all x € B(xo,6). This shows that xg is a local 
minimizer of f. 




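Now to prove (iii), assume that $H_f(\mathbf{x}_0)$ is indefinite. Then there exist unit
vectors $\mathbf{u}_1$ and $\mathbf{u}_2$ so that

\[ \varepsilon_1 = \mathbf{u}_1^T H_f(\mathbf{x}_0) \mathbf{u}_1 < 0, \qquad \varepsilon_2 = \mathbf{u}_2^T H_f(\mathbf{x}_0) \mathbf{u}_2 > 0. \]

Let $\varepsilon = \frac{1}{2} \min\{|\varepsilon_1|, \varepsilon_2\}$. Eq. (4.15) implies that there is a $\delta_0 > 0$ such that
$B(\mathbf{x}_0, \delta_0) \subset \mathcal{O}$ and for all $\mathbf{h}$ with $0 < \|\mathbf{h}\| < \delta_0$,

\[ \left| f(\mathbf{x}_0 + \mathbf{h}) - f(\mathbf{x}_0) - \frac{1}{2} \mathbf{h}^T H_f(\mathbf{x}_0) \mathbf{h} \right| < \varepsilon \|\mathbf{h}\|^2. \tag{4.16} \]

For any $\delta > 0$, let $r = \frac{1}{2} \min\{\delta, \delta_0\}$. Then the points $\mathbf{x}_1 = \mathbf{x}_0 + r\mathbf{u}_1$ and
$\mathbf{x}_2 = \mathbf{x}_0 + r\mathbf{u}_2$ are in the ball $B(\mathbf{x}_0, \delta)$ and the ball $B(\mathbf{x}_0, \delta_0)$. Eq. (4.16)
implies that for $i = 1, 2$,

\[ -r^2 \varepsilon < f(\mathbf{x}_0 + r\mathbf{u}_i) - f(\mathbf{x}_0) - \frac{r^2}{2} \mathbf{u}_i^T H_f(\mathbf{x}_0) \mathbf{u}_i < r^2 \varepsilon. \]

Therefore,

\[ f(\mathbf{x}_0 + r\mathbf{u}_1) - f(\mathbf{x}_0) < r^2 \left( \frac{1}{2} \mathbf{u}_1^T H_f(\mathbf{x}_0) \mathbf{u}_1 + \varepsilon \right) = r^2 \left( \frac{1}{2} \varepsilon_1 + \varepsilon \right) \le 0 \]

since $\varepsilon \le -\frac{1}{2} \varepsilon_1$; while

\[ f(\mathbf{x}_0 + r\mathbf{u}_2) - f(\mathbf{x}_0) > r^2 \left( \frac{1}{2} \mathbf{u}_2^T H_f(\mathbf{x}_0) \mathbf{u}_2 - \varepsilon \right) = r^2 \left( \frac{1}{2} \varepsilon_2 - \varepsilon \right) \ge 0 \]

since $\varepsilon \le \frac{1}{2} \varepsilon_2$. Thus, $\mathbf{x}_1$ and $\mathbf{x}_2$ are points in $B(\mathbf{x}_0, \delta)$ with $f(\mathbf{x}_1) < f(\mathbf{x}_0)$
while $f(\mathbf{x}_2) > f(\mathbf{x}_0)$. These show that $\mathbf{x}_0$ is a saddle point.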


A symmetric matrix is positive definite if and only if all its eigenvalues are 
positive. It is negative definite if and only if all its eigenvalues are negative. It 
is indefinite if it has at least one positive eigenvalue, and at least one negative 
eigenvalue. For a diagonal matrix, its eigenvalues are the entries on the diagonal. 


Let us revisit Example 4.28. 




Example 4.31 


For the functions considered in Example 4.28, we have seen that $(0, 0)$ is a
stationary point of each of them. Notice that

\[ H_f(0, 0) = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} \]

is positive definite,

\[ H_g(0, 0) = \begin{bmatrix} -2 & 0 \\ 0 & -2 \end{bmatrix} \]

is negative definite, and

\[ H_h(0, 0) = \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix} \]

is indefinite. Therefore, $(0, 0)$ is a local minimizer of $f$, a local maximizer
of $g$, and a saddle point of $h$.


Now let us look at an example which shows that when the Hessian matrix is 
positive semi-definite but not positive definite, we cannot make any conclusion 
about the nature of a stationary point. 


Example 4.32 


Consider the functions $f : \mathbb{R}^2 \to \mathbb{R}$ and $g : \mathbb{R}^2 \to \mathbb{R}$ given respectively by
(OMe 0 WOW) =m i 


These are infinitely differentiable functions. It is easy to check that (0, 0) is 
a stationary point of both of them. Now, 


H,(0,0) = H,(0,0) = F | 


is a positive semi-definite matrix. However, (0,0) is a local minimizer of 


f, but a saddle point of g. 


Determining the definiteness of an $n \times n$ symmetric matrix by looking at
the signs of its eigenvalues is inefficient when $n \ge 3$. There is an easier way to
determine whether a symmetric matrix is positive definite. Let us first introduce
the definition of principal submatrices.




Definition 4.20 Principal Submatrices 


Let $A$ be an $n \times n$ matrix. For $1 \le k \le n$, the $k^{\text{th}}$ principal submatrix $M_k$
of $A$ is the $k \times k$ matrix consisting of the first $k$ rows and first $k$ columns of
$A$.


Example 4.33 


For the matrix A 


submatrices are 


respectively. 


Theorem 4.22 Sylvester’s Criterion for Positive Definiteness 


An $n \times n$ symmetric matrix $A$ is positive definite if and only if $\det M_k > 0$
for all $1 \le k \le n$, where $M_k$ is its $k^{\text{th}}$ principal submatrix.


The proof of this theorem is given in Appendix A. Using the fact that a symmetric 
matrix A is negative definite if and only if —A is positive definite, it is easy to 
obtain a criterion for a symmetric matrix to be negative definite in terms of the 
determinants of its principal submatrices. 


Theorem 4.23 Sylvester’s Criterion for Negative Definiteness 


An $n \times n$ symmetric matrix $A$ is negative definite if and only if $(-1)^k \det M_k > 0$ for all $1 \leq k \leq n$, where $M_k$ is its $k^{\text{th}}$ principal submatrix.




Example 4.34 


Consider the matrix
$$A = \begin{bmatrix} 1 & 2 & -3 \\ -1 & 4 & 2 \\ -3 & 5 & 8 \end{bmatrix}.$$
Since
$$\det M_1 = 1, \qquad \det M_2 = 6, \qquad \det M_3 = \det A = 5$$
are all positive, $A$ is positive definite.


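For a function $f : \mathcal{O} \to \mathbb{R}$ defined on an open subset $\mathcal{O}$ of $\mathbb{R}^2$, we have the following.


Theorem 4.24


Let $\mathcal{O}$ be an open subset of $\mathbb{R}^2$. Suppose that $(x_0, y_0)$ is a stationary point of the twice continuously differentiable function $f : \mathcal{O} \to \mathbb{R}$. Let
$$D(x_0, y_0) = \frac{\partial^2 f}{\partial x^2}(x_0, y_0)\,\frac{\partial^2 f}{\partial y^2}(x_0, y_0) - \left[\frac{\partial^2 f}{\partial x \partial y}(x_0, y_0)\right]^2.$$

(i) If $\dfrac{\partial^2 f}{\partial x^2}(x_0, y_0) > 0$ and $D(x_0, y_0) > 0$, then the point $(x_0, y_0)$ is a local minimizer of $f$.

(ii) If $\dfrac{\partial^2 f}{\partial x^2}(x_0, y_0) < 0$ and $D(x_0, y_0) > 0$, then the point $(x_0, y_0)$ is a local maximizer of $f$.

(iii) If $D(x_0, y_0) < 0$, the point $(x_0, y_0)$ is a saddle point of $f$.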


We notice that
$$H_f(x_0, y_0) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2}(x_0, y_0) & \dfrac{\partial^2 f}{\partial x \partial y}(x_0, y_0) \\[2ex] \dfrac{\partial^2 f}{\partial x \partial y}(x_0, y_0) & \dfrac{\partial^2 f}{\partial y^2}(x_0, y_0) \end{bmatrix}.$$
Hence, $\dfrac{\partial^2 f}{\partial x^2}(x_0, y_0)$ is the determinant of the first principal submatrix of $H_f(x_0, y_0)$, while $D(x_0, y_0)$ is the determinant of $H_f(x_0, y_0)$, the second principal submatrix of $H_f(x_0, y_0)$. Thus, (i) and (ii) follow from the Sylvester criteria as well as the second derivative test.

For (iii), we notice that the $2 \times 2$ matrix $H_f(x_0, y_0)$ is indefinite if and only if it has one positive eigenvalue and one negative eigenvalue, if and only if $D(x_0, y_0) = \det H_f(x_0, y_0) < 0$.


Now we look at some examples of the applications of the second derivative 
test. 


Example 4.35 


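Let $f : \mathbb{R}^2 \to \mathbb{R}$ be the function defined as
$$f(x,y) = x^4 + y^4 + 4xy.$$
Find the stationary points of $f$ and classify them.


Solution
Since $f$ is a polynomial function, it is infinitely differentiable.
$$\nabla f(x,y) = (4x^3 + 4y,\; 4y^3 + 4x).$$
To find the stationary points, we need to solve the system of equations
$$x^3 + y = 0, \qquad y^3 + x = 0.$$
From the first equation, we have $y = -x^3$. Substituting into the second equation gives
$$-x^9 + x = 0,$$
or equivalently,
$$x(x^8 - 1) = 0.$$
Thus, $x = 0$ or $x = \pm 1$. When $x = 0$, $y = 0$. When $x = \pm 1$, $y = \mp 1$. Therefore, the stationary points of $f$ are $\mathbf{u}_1 = (0,0)$, $\mathbf{u}_2 = (1,-1)$ and $\mathbf{u}_3 = (-1,1)$. Now,
$$H_f(x,y) = \begin{bmatrix} 12x^2 & 4 \\ 4 & 12y^2 \end{bmatrix}.$$
Therefore,
$$H_f(\mathbf{u}_1) = \begin{bmatrix} 0 & 4 \\ 4 & 0 \end{bmatrix}, \qquad H_f(\mathbf{u}_2) = H_f(\mathbf{u}_3) = \begin{bmatrix} 12 & 4 \\ 4 & 12 \end{bmatrix}.$$
It follows that
$$D(\mathbf{u}_1) = -16 < 0, \qquad D(\mathbf{u}_2) = D(\mathbf{u}_3) = 128 > 0.$$
Since $f_{xx}(\mathbf{u}_2) = f_{xx}(\mathbf{u}_3) = 12 > 0$, we conclude that $\mathbf{u}_1$ is a saddle point, while $\mathbf{u}_2$ and $\mathbf{u}_3$ are local minimizers.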
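
The classification in this example can be checked mechanically. The following is only a minimal sketch in Python, not part of the original text: the helper names `grad` and `classify` are hypothetical, and the partial derivatives of $f(x,y) = x^4 + y^4 + 4xy$ are coded by hand.

```python
def grad(x, y):
    """Gradient of f(x, y) = x**4 + y**4 + 4*x*y (hand-computed)."""
    return (4 * x**3 + 4 * y, 4 * y**3 + 4 * x)


def classify(x, y):
    """Two-variable second derivative test at a stationary point (x, y)."""
    fxx, fyy, fxy = 12 * x**2, 12 * y**2, 4
    d = fxx * fyy - fxy**2  # D(x, y), the determinant of the Hessian
    if d < 0:
        return "saddle point"
    if d > 0:
        return "local minimizer" if fxx > 0 else "local maximizer"
    return "inconclusive"  # D = 0: the test gives no information


for point in [(0, 0), (1, -1), (-1, 1)]:
    assert grad(*point) == (0, 0)  # each point is stationary
    print(point, classify(*point))
```

Integer arithmetic is used throughout, so the stationarity check is exact rather than approximate.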


Figure 4.11: The function $f(x,y) = x^4 + y^4 + 4xy$.




Example 4.36 


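Consider the function $f : \mathbb{R}^3 \to \mathbb{R}$ defined as
$$f(x,y,z) = x^3 - xy^2 + 5x^2 - 4xy - 2xz + y^2 + 6yz + 37z^2.$$
Show that $(0,0,0)$ is a local minimizer of $f$.


Solution
Since $f$ is a polynomial function, it is infinitely differentiable. Since
$$\nabla f(x,y,z) = (3x^2 - y^2 + 10x - 4y - 2z,\; -2xy - 4x + 2y + 6z,\; -2x + 6y + 74z),$$
we find that
$$\nabla f(0,0,0) = (0,0,0).$$
Hence, $(0,0,0)$ is a stationary point. Now,
$$H_f(x,y,z) = \begin{bmatrix} 6x + 10 & -2y - 4 & -2 \\ -2y - 4 & -2x + 2 & 6 \\ -2 & 6 & 74 \end{bmatrix}.$$
Therefore,
$$H_f(0,0,0) = \begin{bmatrix} 10 & -4 & -2 \\ -4 & 2 & 6 \\ -2 & 6 & 74 \end{bmatrix}.$$
The determinants of the three principal submatrices of $H_f(0,0,0)$ are
$$\det M_1 = 10, \qquad \det M_2 = \begin{vmatrix} 10 & -4 \\ -4 & 2 \end{vmatrix} = 4, \qquad \det M_3 = \begin{vmatrix} 10 & -4 & -2 \\ -4 & 2 & 6 \\ -2 & 6 & 74 \end{vmatrix} = 24.$$
This shows that $H_f(0,0,0)$ is positive definite. Hence, $(0,0,0)$ is a local minimizer of $f$.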
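
Sylvester's criterion is easy to automate for small matrices. The sketch below is an illustration only and not taken from the text: the function names are hypothetical, the determinant uses a plain cofactor expansion (adequate for small $n$ but exponentially slow in general), and the input matrix is assumed to be symmetric. It recomputes the three determinants for the Hessian at $(0,0,0)$ in this example.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def leading_minors(a):
    """det M_k for k = 1..n, where M_k is the top-left k x k block of a."""
    return [det([row[:k] for row in a[:k]]) for k in range(1, len(a) + 1)]


def is_positive_definite(a):
    """Sylvester's criterion; a is assumed to be symmetric."""
    return all(d > 0 for d in leading_minors(a))


def is_negative_definite(a):
    """(-1)^k det M_k > 0 for every k; a is assumed to be symmetric."""
    return all((-1) ** k * d > 0
               for k, d in enumerate(leading_minors(a), start=1))


hessian = [[10, -4, -2], [-4, 2, 6], [-2, 6, 74]]
print(leading_minors(hessian))        # [10, 4, 24]
print(is_positive_definite(hessian))  # True
```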




Exercises 4.5 


Question 1 


Let f : R? — R be the function defined as 
f(a,y) = 2? + 4y? + dary — 8a — 1ly +7. 


Find the stationary points of f and classify them. 


Question 2 


Let f : R? > R be the function defined as 
f(z, y) = 2? + 4y? + 3ay — 5a — 18y +1. 


Find the stationary points of f and classify them. 


Question 3 


Let f : R? > R be the function defined as 
f(z,y) =a? + y? + lazy. 


Find the stationary points of f and classify them. 


Question 4 


Consider the function f : R® — R defined as 


f(a,y,z) = 2-227 —2 -y’ -—sy+n-y. 


Show that (1, —1,0) is a stationary point of f and determine the nature of 
this stationary point. 




Question 5 


Consider the function f : R®? > R defined as 


f(a,y,2) = 2 +222 -—2? —y -—asy+e-y. 


Show that (1, —1, 0) is a stationary point of f and determine the nature of 


this stationary point. 




Chapter 5 
The Inverse and Implicit Function Theorems 


In this chapter, we discuss the inverse function theorem and the implicit function theorem, which are two important theorems in multivariable analysis. Given a function that maps a subset of $\mathbb{R}^n$ to $\mathbb{R}^n$, the inverse function theorem gives sufficient conditions for the existence of a local inverse and its differentiability. Given a system of $m$ equations with $n + m$ variables, the implicit function theorem gives sufficient conditions to solve $m$ of the variables in terms of the other $n$ variables locally such that the solutions are differentiable functions. We want to emphasize that these theorems are local, in the sense that each of them asserts the existence of a function defined in a neighbourhood of a point.

In some sense, the two theorems are equivalent, which means one can deduce one from the other. In this book, we will prove the inverse function theorem first, and use it to deduce the implicit function theorem.


5.1 The Inverse Function Theorem 


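Let $\mathfrak{D}$ be a subset of $\mathbb{R}^n$. If the function $\mathbf{F} : \mathfrak{D} \to \mathbb{R}^n$ is one-to-one, we can define the inverse function $\mathbf{F}^{-1} : \mathbf{F}(\mathfrak{D}) \to \mathbb{R}^n$. The question we want to study here is the following. If $\mathfrak{D}$ is an open set and $\mathbf{F}$ is differentiable at the point $\mathbf{x}_0$ in $\mathfrak{D}$, is the inverse function $\mathbf{F}^{-1}$ differentiable at $\mathbf{y}_0 = \mathbf{F}(\mathbf{x}_0)$? For this, we also want the point $\mathbf{y}_0$ to be an interior point of $\mathbf{F}(\mathfrak{D})$. More precisely, is there a neighbourhood $U$ of $\mathbf{x}_0$ that is mapped bijectively by $\mathbf{F}$ to a neighbourhood $V$ of $\mathbf{y}_0$? If the answer is yes, and $\mathbf{F}^{-1}$ is differentiable at $\mathbf{y}_0$, then the chain rule would imply that
$$D\mathbf{F}^{-1}(\mathbf{y}_0)\,D\mathbf{F}(\mathbf{x}_0) = I_n.$$
Hence, a necessary condition for $\mathbf{F}^{-1}$ to be differentiable at $\mathbf{y}_0$ is that the derivative matrix $D\mathbf{F}(\mathbf{x}_0)$ has to be invertible.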


Let us study the map $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = x^2$. The range of the function is $[0, \infty)$. Notice that if $x_0 > 0$, then $I = (0, \infty)$ is a neighbourhood of $x_0$ that is mapped bijectively by $f$ to the neighbourhood $J = (0, \infty)$ of $f(x_0)$. If $x_0 < 0$, then $I = (-\infty, 0)$ is a neighbourhood of $x_0$ that is mapped bijectively by $f$ to the neighbourhood $J = (0, \infty)$ of $f(x_0)$. However, if $x_0 = 0$, the point $f(x_0) = 0$ is not an interior point of $f(\mathbb{R}) = [0, \infty)$. Notice that $f'(x) = 2x$. Therefore, $x = 0$ is the point at which $f'(x) = 0$.

If $x_0 > 0$, take $I = (0, \infty)$ and $J = (0, \infty)$. Then $f : I \to J$ has an inverse given by $f^{-1} : J \to I$, $f^{-1}(x) = \sqrt{x}$. It is a differentiable function with
$$(f^{-1})'(x) = \frac{1}{2\sqrt{x}}.$$
In particular, at $y_0 = f(x_0) = x_0^2$,
$$(f^{-1})'(y_0) = \frac{1}{2\sqrt{y_0}} = \frac{1}{2x_0} = \frac{1}{f'(x_0)}.$$

Similarly, if $x_0 < 0$, take $I = (-\infty, 0)$ and $J = (0, \infty)$. Then $f : I \to J$ has an inverse given by $f^{-1} : J \to I$, $f^{-1}(x) = -\sqrt{x}$. It is a differentiable function with
$$(f^{-1})'(x) = -\frac{1}{2\sqrt{x}}.$$
In particular, at $y_0 = f(x_0) = x_0^2$,
$$(f^{-1})'(y_0) = -\frac{1}{2\sqrt{y_0}} = \frac{1}{2x_0} = \frac{1}{f'(x_0)}.$$


For a single variable function, the inverse function theorem takes the following 


form. 


Theorem 5.1 (Single Variable) Inverse Function Theorem 


Let $\mathcal{O}$ be an open subset of $\mathbb{R}$ that contains the point $x_0$, and let $f : \mathcal{O} \to \mathbb{R}$ be a continuously differentiable function defined on $\mathcal{O}$. Suppose that $f'(x_0) \neq 0$. Then there exists an open interval $I$ containing $x_0$ such that $f$ maps $I$ bijectively onto the open interval $J = f(I)$. The inverse function $f^{-1} : J \to I$ is continuously differentiable. For any $y \in J$, if $x$ is the point in $I$ such that $f(x) = y$, then
$$(f^{-1})'(y) = \frac{1}{f'(x)}.$$


Figure 5.1: The function $f : \mathbb{R} \to \mathbb{R}$, $f(x) = x^2$.


Without loss of generality, assume that $f'(x_0) > 0$. Since $\mathcal{O}$ is an open set and $f'$ is continuous at $x_0$, there is an $r_1 > 0$ such that $(x_0 - r_1, x_0 + r_1) \subset \mathcal{O}$ and for all $x \in (x_0 - r_1, x_0 + r_1)$,
$$|f'(x) - f'(x_0)| < \frac{f'(x_0)}{2}.$$
This implies that
$$f'(x) > \frac{f'(x_0)}{2} > 0 \quad \text{for all } x \in (x_0 - r_1, x_0 + r_1).$$
Therefore, $f$ is strictly increasing on $(x_0 - r_1, x_0 + r_1)$. Take any $r > 0$ that is less than $r_1$. Then $[x_0 - r, x_0 + r] \subset (x_0 - r_1, x_0 + r_1)$. By the intermediate value theorem, the function $f$ maps $[x_0 - r, x_0 + r]$ bijectively onto $[f(x_0 - r), f(x_0 + r)]$. Let $I = (x_0 - r, x_0 + r)$ and $J = (f(x_0 - r), f(x_0 + r))$. Then $f : I \to J$ is a bijection and $f^{-1} : J \to I$ exists. In volume I, we have proved that $f^{-1}$ is differentiable, and
$$(f^{-1})'(y) = \frac{1}{f'(f^{-1}(y))} \quad \text{for all } y \in J.$$
This formula shows that $(f^{-1})' : J \to \mathbb{R}$ is continuous.




Remark 5.1 


In the inverse function theorem, we determine the invertibility of the function in a neighbourhood of a point $x_0$. The theorem says that if $f$ is continuously differentiable and $f'(x_0) \neq 0$, then $f$ is locally invertible at $x_0$. Here the assumption that $f'$ is continuous is essential. In volume I, we have seen that for a continuous function $f : I \to \mathbb{R}$ defined on an open interval $I$ to be one-to-one, it is necessary that it is strictly monotonic. The function $f : \mathbb{R} \to \mathbb{R}$,
$$f(x) = \begin{cases} x + x^2 \sin\left(\dfrac{1}{x}\right), & \text{if } x \neq 0, \\ 0, & \text{if } x = 0, \end{cases}$$
is an example of a differentiable function where $f'(0) = 1 \neq 0$, but $f$ fails to be strictly monotonic in any neighbourhood of the point $x = 0$.

This annoying behavior can be removed if we assume that $f'$ is continuous. If $f'(x_0) \neq 0$ and $f'$ is continuous, there is a neighbourhood $I$ of $x_0$ such that $f'(x)$ has the same sign as $f'(x_0)$ for all $x \in I$. This implies that $f$ is strictly monotonic on $I$.


Example 5.1 


Let $f : \mathbb{R} \to \mathbb{R}$ be the function defined as
$$f(x) = 2x + 4\cos x.$$
Show that there is an open interval $I$ containing $0$ such that $f : I \to \mathbb{R}$ is one-to-one, and $f^{-1} : f(I) \to \mathbb{R}$ is continuously differentiable. Determine $(f^{-1})'(f(0))$.


Solution
The function $f$ is infinitely differentiable and $f'(x) = 2 - 4\sin x$. Since $f'(0) = 2 \neq 0$, the inverse function theorem says that there is an open interval $I$ containing $0$ such that $f : I \to \mathbb{R}$ is one-to-one, and $f^{-1} : f(I) \to \mathbb{R}$ is continuously differentiable. Moreover,
$$(f^{-1})'(f(0)) = \frac{1}{f'(0)} = \frac{1}{2}.$$
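
The value of $(f^{-1})'(f(0))$ can be sanity-checked numerically. The sketch below is not part of the text and rests on stated assumptions: the local inverse is computed by bisection on the hypothetical interval $[-0.5, 0.5]$, on which $f'(x) = 2 - 4\sin x > 0$, and the derivative of the inverse is estimated with a central difference.

```python
import math


def f(x):
    return 2 * x + 4 * math.cos(x)


def f_prime(x):
    return 2 - 4 * math.sin(x)


def f_inverse(y, lo=-0.5, hi=0.5):
    """Invert f on [lo, hi] by bisection; f is strictly increasing there."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


y0 = f(0.0)  # y0 = 4
eps = 1e-6
# Central-difference estimate of the derivative of the local inverse at y0.
numeric = (f_inverse(y0 + eps) - f_inverse(y0 - eps)) / (2 * eps)
print(numeric, 1 / f_prime(0.0))
```

The two printed numbers should agree to several decimal places, in line with the formula $(f^{-1})'(y) = 1/f'(x)$.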


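Now let us consider functions defined on open subsets of $\mathbb{R}^n$, where $n \geq 2$. We first consider a linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$. There is an $n \times n$ matrix $A$ such that
$$\mathbf{T}(\mathbf{x}) = A\mathbf{x}.$$
The mapping $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$ is one-to-one if and only if $A$ is invertible, if and only if $\det A \neq 0$. In this case, $\mathbf{T}$ is a bijection and $\mathbf{T}^{-1} : \mathbb{R}^n \to \mathbb{R}^n$ is the linear transformation given by
$$\mathbf{T}^{-1}(\mathbf{x}) = A^{-1}\mathbf{x}.$$
Notice that for any $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$,
$$D\mathbf{T}(\mathbf{x}) = A, \qquad D\mathbf{T}^{-1}(\mathbf{y}) = A^{-1}.$$
The content of the inverse function theorem is to extend this to nonlinear mappings.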


Theorem 5.2 Inverse Function Theorem 


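Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $\mathbf{F} : \mathcal{O} \to \mathbb{R}^n$ be a continuously differentiable function defined on $\mathcal{O}$. If $\det D\mathbf{F}(\mathbf{x}_0) \neq 0$, then we have the following.

(i) There exists a neighbourhood $U$ of $\mathbf{x}_0$ such that $\mathbf{F}$ maps $U$ bijectively onto the open set $V = \mathbf{F}(U)$.

(ii) The inverse function $\mathbf{F}^{-1} : V \to U$ is continuously differentiable.

(iii) For any $\mathbf{y} \in V$, if $\mathbf{x}$ is the point in $U$ such that $\mathbf{F}(\mathbf{x}) = \mathbf{y}$, then
$$D\mathbf{F}^{-1}(\mathbf{y}) = D\mathbf{F}(\mathbf{F}^{-1}(\mathbf{y}))^{-1} = D\mathbf{F}(\mathbf{x})^{-1}.$$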




Figure 5.2: The inverse function theorem. 


For a linear transformation which is a degree one polynomial mapping, the 
inverse function theorem holds globally. For a general continuously differentiable 
mapping, the inverse function theorem says that the first order approximation of 
the function at a point can determine the local invertibility of the function at that 
point. 

When $n \geq 2$, the proof of the inverse function theorem is substantially more complicated than the $n = 1$ case, as we do not have the monotonicity argument used in the $n = 1$ case. The proof will be presented in Section 5.2. We will discuss the examples and applications in this section.


Example 5.2 


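Let $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2$ be the mapping defined by
$$\mathbf{F}(x,y) = (3x - 2y + 7,\; 4x + 5y - 2).$$
Show that $\mathbf{F}$ is a bijection, and find $\mathbf{F}^{-1}(x,y)$ and $D\mathbf{F}^{-1}(x,y)$.


Solution
The mapping $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2$ can be written as $\mathbf{F}(\mathbf{x}) = \mathbf{T}(\mathbf{x}) + \mathbf{b}$, where $\mathbf{T} : \mathbb{R}^2 \to \mathbb{R}^2$ is the linear transformation
$$\mathbf{T}(x,y) = (3x - 2y,\; 4x + 5y),$$
and $\mathbf{b} = (7, -2)$. For $\mathbf{u} = (x,y)$, $\mathbf{T}(\mathbf{u}) = A\mathbf{u}$, where
$$A = \begin{bmatrix} 3 & -2 \\ 4 & 5 \end{bmatrix}.$$
Since $\det A = 23 \neq 0$, the linear transformation $\mathbf{T} : \mathbb{R}^2 \to \mathbb{R}^2$ is one-to-one. Hence, $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2$ is also one-to-one. Given $\mathbf{v} \in \mathbb{R}^2$, let $\mathbf{u} = A^{-1}(\mathbf{v} - \mathbf{b})$. Then $\mathbf{F}(\mathbf{u}) = \mathbf{v}$. Hence, $\mathbf{F}$ is also onto. The inverse $\mathbf{F}^{-1} : \mathbb{R}^2 \to \mathbb{R}^2$ is given by
$$\mathbf{F}^{-1}(\mathbf{v}) = A^{-1}(\mathbf{v} - \mathbf{b}).$$
Since
$$A^{-1} = \frac{1}{23}\begin{bmatrix} 5 & 2 \\ -4 & 3 \end{bmatrix},$$
we find that
$$\mathbf{F}^{-1}(x,y) = \left(\frac{5(x - 7) + 2(y + 2)}{23},\; \frac{-4(x - 7) + 3(y + 2)}{23}\right) = \left(\frac{5x + 2y - 31}{23},\; \frac{-4x + 3y + 34}{23}\right),$$
and
$$D\mathbf{F}^{-1}(x,y) = A^{-1} = \frac{1}{23}\begin{bmatrix} 5 & 2 \\ -4 & 3 \end{bmatrix}.$$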
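
A quick way to gain confidence in an explicit inverse is to compose it with the original map. The sketch below is an illustration, not part of the text: the names `F` and `F_inv` are hypothetical, and exact rational arithmetic from Python's `fractions` module is used so that the round trip can be tested for exact equality.

```python
from fractions import Fraction


def F(x, y):
    """The affine map F(x, y) = (3x - 2y + 7, 4x + 5y - 2)."""
    return (3 * x - 2 * y + 7, 4 * x + 5 * y - 2)


def F_inv(x, y):
    """Candidate inverse ((5x + 2y - 31)/23, (-4x + 3y + 34)/23)."""
    d = Fraction(23)
    return ((5 * x + 2 * y - 31) / d, (-4 * x + 3 * y + 34) / d)


# Check both compositions on a small grid of integer points.
for x in range(-3, 4):
    for y in range(-3, 4):
        assert F_inv(*F(x, y)) == (x, y)
        assert F(*F_inv(x, y)) == (x, y)

print(F_inv(*F(1, 1)))  # (Fraction(1, 1), Fraction(1, 1))
```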


Example 5.3 


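Determine the values of $a$ such that the mapping $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ defined by
$$\mathbf{F}(x,y,z) = (2x + y + az,\; x - y + 3z,\; 3x + 2y + z + 7)$$
is invertible.


Solution
The mapping $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ can be written as $\mathbf{F}(\mathbf{x}) = \mathbf{T}(\mathbf{x}) + \mathbf{b}$, where $\mathbf{T} : \mathbb{R}^3 \to \mathbb{R}^3$ is the linear transformation
$$\mathbf{T}(x,y,z) = (2x + y + az,\; x - y + 3z,\; 3x + 2y + z),$$
and $\mathbf{b} = (0, 0, 7)$. Thus, $\mathbf{F}$ is a degree one polynomial mapping with
$$D\mathbf{F}(\mathbf{x}) = \begin{bmatrix} 2 & 1 & a \\ 1 & -1 & 3 \\ 3 & 2 & 1 \end{bmatrix}.$$
The mapping $\mathbf{F}$ is invertible if and only if it is one-to-one, if and only if $\mathbf{T}$ is one-to-one, if and only if $\det D\mathbf{F}(\mathbf{x}) \neq 0$. Since
$$\det D\mathbf{F}(\mathbf{x}) = 5a - 6,$$
the mapping $\mathbf{F}$ is invertible if and only if $a \neq 6/5$.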


Example 5.4 


Let $\boldsymbol{\Phi} : \mathbb{R}^2 \to \mathbb{R}^2$ be the mapping defined as
$$\boldsymbol{\Phi}(r, \theta) = (r\cos\theta,\; r\sin\theta).$$
Determine the points $(r, \theta) \in \mathbb{R}^2$ where the inverse function theorem can be applied to this mapping. Explain the significance of this result.


Solution
Since $\sin\theta$ and $\cos\theta$ are infinitely differentiable functions, the mapping $\boldsymbol{\Phi}$ is infinitely differentiable with
$$D\boldsymbol{\Phi}(r, \theta) = \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix}.$$
Since
$$\det D\boldsymbol{\Phi}(r, \theta) = r\cos^2\theta + r\sin^2\theta = r,$$
the inverse function theorem is not applicable at the point $(r, \theta)$ if $r = 0$. The mapping $\boldsymbol{\Phi}$ is a change from polar coordinates to rectangular coordinates. The result above shows that the change of coordinates is locally one-to-one away from the origin of the $xy$-plane.
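
The determinant of the polar-coordinate map can also be estimated numerically. The following sketch is illustrative only and not from the text: `jacobian_det` is a hypothetical helper that approximates the partial derivatives by central differences with an assumed step size `h`.

```python
import math


def phi(r, theta):
    """Polar-to-rectangular map (r, theta) -> (r cos(theta), r sin(theta))."""
    return (r * math.cos(theta), r * math.sin(theta))


def jacobian_det(r, theta, h=1e-6):
    """Central-difference estimate of the determinant of D(phi) at (r, theta)."""
    # Partial derivatives of (x, y) with respect to r, then theta.
    dr = [(a - b) / (2 * h)
          for a, b in zip(phi(r + h, theta), phi(r - h, theta))]
    dt = [(a - b) / (2 * h)
          for a, b in zip(phi(r, theta + h), phi(r, theta - h))]
    return dr[0] * dt[1] - dt[0] * dr[1]


print(jacobian_det(2.0, 0.7))  # close to r = 2
print(jacobian_det(0.0, 1.3))  # close to 0
# At r = 0 every angle maps to the origin, so phi is not one-to-one there.
print(phi(0.0, 0.1) == phi(0.0, 2.5))
```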




Example 5.5 


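Consider the mapping $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2$ given by
$$\mathbf{F}(x,y) = (x^2 - y^2,\; xy).$$
Show that there is a neighbourhood $U$ of the point $\mathbf{u}_0 = (1,1)$ such that $\mathbf{F} : U \to \mathbb{R}^2$ is one-to-one, $V = \mathbf{F}(U)$ is an open set, and $\mathbf{G} = \mathbf{F}^{-1} : V \to U$ is continuously differentiable. Then find $\dfrac{\partial \mathbf{G}}{\partial y}(0,1)$.


Solution
The mapping $\mathbf{F}$ is a polynomial mapping. Thus, it is continuously differentiable. Notice that $\mathbf{F}(\mathbf{u}_0) = (0,1)$ and
$$D\mathbf{F}(x,y) = \begin{bmatrix} 2x & -2y \\ y & x \end{bmatrix}, \qquad D\mathbf{F}(\mathbf{u}_0) = \begin{bmatrix} 2 & -2 \\ 1 & 1 \end{bmatrix}.$$
Since $\det D\mathbf{F}(\mathbf{u}_0) = 4 \neq 0$, the inverse function theorem implies that there is a neighbourhood $U$ of the point $\mathbf{u}_0$ such that $\mathbf{F} : U \to \mathbb{R}^2$ is one-to-one, $V = \mathbf{F}(U)$ is an open set, and $\mathbf{G} = \mathbf{F}^{-1} : V \to U$ is continuously differentiable. Moreover,
$$D\mathbf{G}(0,1) = D\mathbf{F}(1,1)^{-1} = \frac{1}{4}\begin{bmatrix} 1 & 2 \\ -1 & 2 \end{bmatrix}.$$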


From here, we find that 


Example 5.6 


Consider the system of equations
$$\begin{aligned} \sin(x + y) + x^2 y + 3xy^2 &= 2, \\ 2xy + 5x^2 - 2y^2 &= 1. \end{aligned}$$
Observe that $(x,y) = (1,-1)$ is a solution of this system. Show that there is a neighbourhood $U$ of $\mathbf{u}_0 = (1,-1)$ and an $r > 0$ such that for all $(a,b)$ satisfying $(a - 2)^2 + (b - 1)^2 < r^2$, the system
$$\begin{aligned} \sin(x + y) + x^2 y + 3xy^2 &= a, \\ 2xy + 5x^2 - 2y^2 &= b \end{aligned}$$
has a unique solution $(x,y)$ that lies in $U$.


Solution
Let $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2$ be the function defined by
$$\mathbf{F}(x,y) = \left(\sin(x + y) + x^2 y + 3xy^2,\; 2xy + 5x^2 - 2y^2\right).$$
Since the sine function is infinitely differentiable, $\sin(x + y)$ is infinitely differentiable. The functions $g(x,y) = x^2 y + 3xy^2$ and $F_2(x,y) = 2xy + 5x^2 - 2y^2$ are polynomial functions. Hence, they are also infinitely differentiable. This shows that $\mathbf{F}$ is infinitely differentiable. Since
$$D\mathbf{F}(x,y) = \begin{bmatrix} \cos(x + y) + 2xy + 3y^2 & \cos(x + y) + x^2 + 6xy \\ 2y + 10x & 2x - 4y \end{bmatrix},$$
we find that
$$D\mathbf{F}(1,-1) = \begin{bmatrix} 2 & -4 \\ 8 & 6 \end{bmatrix}.$$
It follows that $\det D\mathbf{F}(1,-1) = 44 \neq 0$.

By the inverse function theorem, there exists a neighbourhood $U_1$ of $\mathbf{u}_0$ such that $\mathbf{F} : U_1 \to \mathbb{R}^2$ is one-to-one and $V = \mathbf{F}(U_1)$ is an open set. Since $\mathbf{F}(\mathbf{u}_0) = (2,1)$, the point $\mathbf{v}_0 = (2,1)$ is a point in the open set $V$. Hence, there exists $r > 0$ such that $B(\mathbf{v}_0, r) \subset V$. Since $B(\mathbf{v}_0, r)$ is open and $\mathbf{F}$ is continuous, $U = \mathbf{F}^{-1}(B(\mathbf{v}_0, r))$ is an open subset of $\mathbb{R}^2$. The map $\mathbf{F} : U \to B(\mathbf{v}_0, r)$ is a bijection. For all $(a,b)$ satisfying $(a - 2)^2 + (b - 1)^2 < r^2$, $(a,b)$ is in $B(\mathbf{v}_0, r)$. Hence, there is a unique $(x,y)$ in $U$ such that $\mathbf{F}(x,y) = (a,b)$. This means that the system
$$\begin{aligned} \sin(x + y) + x^2 y + 3xy^2 &= a, \\ 2xy + 5x^2 - 2y^2 &= b \end{aligned}$$
has a unique solution $(x,y)$ that lies in $U$.
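
The inverse function theorem only asserts that the solution exists. To actually compute it for a given $(a,b)$ near $(2,1)$, one common numerical choice is Newton's method started at $\mathbf{u}_0 = (1,-1)$. The sketch below is an illustration under that assumption and is not the method of the text; `solve_near` is a hypothetical helper, and convergence is only expected when $(a,b)$ is close to $(2,1)$.

```python
import math


def F(x, y):
    return (math.sin(x + y) + x**2 * y + 3 * x * y**2,
            2 * x * y + 5 * x**2 - 2 * y**2)


def DF(x, y):
    """Derivative matrix of F, written row by row."""
    c = math.cos(x + y)
    return ((c + 2 * x * y + 3 * y**2, c + x**2 + 6 * x * y),
            (2 * y + 10 * x, 2 * x - 4 * y))


def solve_near(a, b, x=1.0, y=-1.0, steps=20):
    """Newton's method for F(x, y) = (a, b), started at u0 = (1, -1)."""
    for _ in range(steps):
        (p, q), (r, s) = DF(x, y)
        f1, f2 = F(x, y)
        e1, e2 = f1 - a, f2 - b
        det = p * s - q * r
        # Solve DF * (dx, dy) = (e1, e2) with the 2 x 2 inverse, then step.
        x -= (s * e1 - q * e2) / det
        y -= (-r * e1 + p * e2) / det
    return x, y


x, y = solve_near(2.05, 0.95)
print((x, y), F(x, y))
```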


At the end of this section, let us prove the following theorem. 


Theorem 5.3 


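Let $A$ be an $n \times n$ matrix, and let $\mathbf{x}_0$ and $\mathbf{y}_0$ be two points in $\mathbb{R}^n$. Define the mapping $\mathbf{F} : \mathbb{R}^n \to \mathbb{R}^n$ by
$$\mathbf{F}(\mathbf{x}) = \mathbf{y}_0 + A(\mathbf{x} - \mathbf{x}_0).$$
Then $\mathbf{F}$ is infinitely differentiable with $D\mathbf{F}(\mathbf{x}) = A$. It is one-to-one and onto if and only if $\det A \neq 0$. In this case,
$$\mathbf{F}^{-1}(\mathbf{y}) = \mathbf{x}_0 + A^{-1}(\mathbf{y} - \mathbf{y}_0), \quad \text{and} \quad D\mathbf{F}^{-1}(\mathbf{y}) = A^{-1}.$$
In particular, $\mathbf{F}^{-1}$ is also infinitely differentiable.


Obviously, $\mathbf{F}$ is a polynomial mapping. Hence, $\mathbf{F}$ is infinitely differentiable. By a straightforward computation, we find that $D\mathbf{F} = A$.

Notice that $\mathbf{F} = \mathbf{F}_2 \circ \mathbf{T} \circ \mathbf{F}_1$, where $\mathbf{F}_1 : \mathbb{R}^n \to \mathbb{R}^n$ is the translation $\mathbf{F}_1(\mathbf{x}) = \mathbf{x} - \mathbf{x}_0$, $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$ is the linear transformation $\mathbf{T}(\mathbf{x}) = A\mathbf{x}$, and $\mathbf{F}_2 : \mathbb{R}^n \to \mathbb{R}^n$ is the translation $\mathbf{F}_2(\mathbf{y}) = \mathbf{y} + \mathbf{y}_0$. Since translations are bijective mappings, $\mathbf{F}$ is one-to-one and onto if and only if $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$ is one-to-one and onto, if and only if $\det A \neq 0$.

If
$$\mathbf{y} = \mathbf{y}_0 + A(\mathbf{x} - \mathbf{x}_0),$$
then
$$\mathbf{x} = \mathbf{x}_0 + A^{-1}(\mathbf{y} - \mathbf{y}_0).$$
This gives the formula for $\mathbf{F}^{-1}(\mathbf{y})$. The formula for $D\mathbf{F}^{-1}(\mathbf{y})$ follows.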




Exercises 5.1 


Question 1 


Let f : R > R be the function defined as 


=e** + A4rsinx + 2cosz. 


Show that there is an open interval J containing 0 such that f : J > Ris 


one-to-one, and f~! : f(I) — R is continuously differentiable. Determine 


(f-")(F(0)). 


Question 2 


Let $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2$ be the mapping defined by
$$\mathbf{F}(x,y) = (3x + 2y - 5,\; 7x + 4y - 3).$$
Show that $\mathbf{F}$ is a bijection, and find $\mathbf{F}^{-1}(x,y)$ and $D\mathbf{F}^{-1}(x,y)$.


Question 3 


Consider the mapping F : R? — R? given by 
F(x,y) = (2° +y", ay). 


Show that there is a neighbourhood U of the point up = (2, 1) such that F : 
U — R? is one-to-one, V = F(U) is an open set, and G =F-!:V =U 


OG 
is continuously differentiable. Then find ae 2): 
Bp 


Question 4 


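Let $\boldsymbol{\Phi} : \mathbb{R}^3 \to \mathbb{R}^3$ be the mapping defined as
$$\boldsymbol{\Phi}(\rho, \phi, \theta) = (\rho\sin\phi\cos\theta,\; \rho\sin\phi\sin\theta,\; \rho\cos\phi).$$
Determine the points $(\rho, \phi, \theta) \in \mathbb{R}^3$ where the inverse function theorem can be applied to this mapping. Explain the significance of this result.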




Question 5 


Consider the system of equations 


4¢5 f Oty — 2 
oye = a 


Observe that (x, y) = (—1, 1) is a solution of this system. Show that there 


is a neighbourhood U of ug = (—1, 1) and an r > 0 such that for all (a, b) 
satisfying (a — 2)? + (b — 5)? < r?, the system 


4g + y — ory =a, 
3 — 


has a unique solution (x, y) that lies in U. 




5.2 The Proof of the Inverse Function Theorem 


In this section, we prove the inverse function theorem stated in Theorem 5.2. The hardest part of the proof is the first statement, which asserts that there is a neighbourhood $U$ of $\mathbf{x}_0$ such that restricted to $U$, $\mathbf{F}$ is one-to-one, and the image of $U$ under $\mathbf{F}$ is open in $\mathbb{R}^n$.

In the statement of the inverse function theorem, we assume that the derivative matrix of the continuously differentiable mapping $\mathbf{F} : \mathcal{O} \to \mathbb{R}^n$ is invertible at the point $\mathbf{x}_0$. The continuity of the partial derivatives of $\mathbf{F}$ then implies that there is a neighbourhood $N$ of $\mathbf{x}_0$ such that the derivative matrix of $\mathbf{F}$ at any $\mathbf{x}$ in $N$ is also invertible.

Theorem 3.38 asserts that a linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$ is invertible if and only if there is a positive constant $c$ such that
$$\|\mathbf{T}(\mathbf{u}) - \mathbf{T}(\mathbf{v})\| \geq c\|\mathbf{u} - \mathbf{v}\| \quad \text{for all } \mathbf{u}, \mathbf{v} \in \mathbb{R}^n.$$


Definition 5.1 Stable Mappings 


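A mapping $\mathbf{F} : \mathfrak{D} \to \mathbb{R}^n$ is stable if there is a positive constant $c$ such that
$$\|\mathbf{F}(\mathbf{u}) - \mathbf{F}(\mathbf{v})\| \geq c\|\mathbf{u} - \mathbf{v}\| \quad \text{for all } \mathbf{u}, \mathbf{v} \in \mathfrak{D}.$$


In other words, a linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$ is invertible if and only if it is stable.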
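
For an invertible linear map, one admissible stability constant is the smallest singular value of its matrix. This is a standard linear algebra fact and is not a claim about how Theorem 3.38 is proved in the text. The sketch below is illustrative only: it uses the hypothetical $2 \times 2$ matrix $A$ with rows $(3, -2)$ and $(4, 5)$, computes that constant from the eigenvalues of $A^{T}A$, and spot-checks the inequality $\|A\mathbf{u} - A\mathbf{v}\| \geq c\|\mathbf{u} - \mathbf{v}\|$ on random pairs.

```python
import math
import random

A = ((3.0, -2.0), (4.0, 5.0))


def apply(a, v):
    """Matrix-vector product for a 2 x 2 matrix."""
    return (a[0][0] * v[0] + a[0][1] * v[1],
            a[1][0] * v[0] + a[1][1] * v[1])


def norm(v):
    return math.hypot(v[0], v[1])


def stability_constant(a):
    """Smallest singular value: sqrt of the least eigenvalue of a^T a."""
    p = a[0][0] ** 2 + a[1][0] ** 2
    q = a[0][0] * a[0][1] + a[1][0] * a[1][1]
    s = a[0][1] ** 2 + a[1][1] ** 2
    trace, det = p + s, p * s - q * q
    disc = max(trace * trace - 4 * det, 0.0)  # guard against rounding
    return math.sqrt((trace - math.sqrt(disc)) / 2)


c = stability_constant(A)
random.seed(0)
for _ in range(1000):
    u = (random.uniform(-5, 5), random.uniform(-5, 5))
    v = (random.uniform(-5, 5), random.uniform(-5, 5))
    diff = (u[0] - v[0], u[1] - v[1])
    # Linearity gives A u - A v = A (u - v).
    assert norm(apply(A, diff)) >= c * norm(diff) - 1e-9
print(c)
```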


Remark 5.2 Stable Mappings vs Lipschitz Mappings 


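Let $\mathfrak{D}$ be a subset of $\mathbb{R}^n$. Observe that if $\mathbf{F} : \mathfrak{D} \to \mathbb{R}^n$ is a stable mapping, there is a constant $c > 0$ such that
$$\|\mathbf{F}(\mathbf{u}_1) - \mathbf{F}(\mathbf{u}_2)\| \geq c\|\mathbf{u}_1 - \mathbf{u}_2\| \quad \text{for all } \mathbf{u}_1, \mathbf{u}_2 \in \mathfrak{D}.$$
This implies that $\mathbf{F}$ is one-to-one, and thus the inverse $\mathbf{F}^{-1} : \mathbf{F}(\mathfrak{D}) \to \mathfrak{D}$ exists. Notice that for any $\mathbf{v}_1$ and $\mathbf{v}_2$ in $\mathbf{F}(\mathfrak{D})$,
$$\|\mathbf{F}^{-1}(\mathbf{v}_1) - \mathbf{F}^{-1}(\mathbf{v}_2)\| \leq \frac{1}{c}\|\mathbf{v}_1 - \mathbf{v}_2\|.$$
This means that $\mathbf{F}^{-1} : \mathbf{F}(\mathfrak{D}) \to \mathbb{R}^n$ is a Lipschitz mapping.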


A mapping $\mathbf{F} : \mathcal{O} \to \mathbb{R}^n$ that satisfies the assumptions in the statement of the inverse function theorem is stable in a neighbourhood of $\mathbf{x}_0$.


Theorem 5.4


Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $\mathbf{F} : \mathcal{O} \to \mathbb{R}^n$ be a continuously differentiable function defined on $\mathcal{O}$. If $\det D\mathbf{F}(\mathbf{x}_0) \neq 0$, then there exists a neighbourhood $U$ of $\mathbf{x}_0$ such that $D\mathbf{F}(\mathbf{x})$ is invertible for all $\mathbf{x} \in U$, $\mathbf{F}$ maps $U$ bijectively onto the open set $V = \mathbf{F}(U)$, and the map $\mathbf{F} : U \to V$ is stable.


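Recall that when $A$ is a subset of $\mathbb{R}^n$ and $\mathbf{u}$ is a point in $\mathbb{R}^n$,
$$A + \mathbf{u} = \{\mathbf{a} + \mathbf{u} \mid \mathbf{a} \in A\}$$
is the translate of the set $A$ by the vector $\mathbf{u}$. The set $A$ is open if and only if $A + \mathbf{u}$ is open, and $A$ is closed if and only if $A + \mathbf{u}$ is closed.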


Lemma 5.5 


It is sufficient to prove Theorem 5.4 when $\mathbf{x}_0 = \mathbf{0}$, $\mathbf{F}(\mathbf{x}_0) = \mathbf{0}$ and $D\mathbf{F}(\mathbf{x}_0) = I_n$.


Proof of Lemma 5.5 
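Assume that Theorem 5.4 holds when $\mathbf{x}_0 = \mathbf{0}$, $\mathbf{F}(\mathbf{x}_0) = \mathbf{0}$ and $D\mathbf{F}(\mathbf{x}_0) = I_n$.

Now given that $\mathbf{F} : \mathcal{O} \to \mathbb{R}^n$ is a continuously differentiable mapping with $\det D\mathbf{F}(\mathbf{x}_0) \neq 0$, let $\mathbf{y}_0 = \mathbf{F}(\mathbf{x}_0)$ and $A = D\mathbf{F}(\mathbf{x}_0)$. Then $A$ is invertible. Define the open set $\mathcal{D}$ as $\mathcal{D} = \mathcal{O} - \mathbf{x}_0$. It is a neighbourhood of the point $\mathbf{0}$. Let $\mathbf{G} : \mathcal{D} \to \mathbb{R}^n$ be the mapping
$$\mathbf{G}(\mathbf{x}) = A^{-1}\left(\mathbf{F}(\mathbf{x} + \mathbf{x}_0) - \mathbf{y}_0\right).$$
Then $\mathbf{G}(\mathbf{0}) = \mathbf{0}$. Using the same reasoning as the proof of Theorem 5.3, we find that $\mathbf{G}$ is continuously differentiable and
$$D\mathbf{G}(\mathbf{x}) = A^{-1}D\mathbf{F}(\mathbf{x} + \mathbf{x}_0).$$
This gives
$$D\mathbf{G}(\mathbf{0}) = A^{-1}D\mathbf{F}(\mathbf{x}_0) = I_n.$$
By assumption, Theorem 5.4 holds for the mapping $\mathbf{G}$. Namely, there exist neighbourhoods $\mathcal{U}$ and $\mathcal{V}$ of $\mathbf{0}$ such that $\mathbf{G} : \mathcal{U} \to \mathcal{V}$ is a bijection and $D\mathbf{G}(\mathbf{x})$ is invertible for all $\mathbf{x} \in \mathcal{U}$. Moreover, there is a positive constant $a$ such that
$$\|\mathbf{G}(\mathbf{u}_1) - \mathbf{G}(\mathbf{u}_2)\| \geq a\|\mathbf{u}_1 - \mathbf{u}_2\| \quad \text{for all } \mathbf{u}_1, \mathbf{u}_2 \in \mathcal{U}.$$
Let $U$ be the neighbourhood of $\mathbf{x}_0$ given by $U = \mathcal{U} + \mathbf{x}_0$. By Theorem 5.3, the mapping $\mathbf{H} : \mathbb{R}^n \to \mathbb{R}^n$,
$$\mathbf{H}(\mathbf{y}) = A^{-1}(\mathbf{y} - \mathbf{y}_0)$$
is a continuous bijection. Therefore, $V = \mathbf{H}^{-1}(\mathcal{V})$ is an open subset of $\mathbb{R}^n$ that contains $\mathbf{y}_0$. By definition, $\mathbf{F}$ maps $U$ bijectively to $V$. Since
$$\mathbf{F}(\mathbf{x}) = \mathbf{y}_0 + A\mathbf{G}(\mathbf{x} - \mathbf{x}_0),$$
we find that
$$D\mathbf{F}(\mathbf{x}) = A\left(D\mathbf{G}(\mathbf{x} - \mathbf{x}_0)\right).$$
Since $A$ is invertible, $D\mathbf{F}(\mathbf{x})$ is invertible for all $\mathbf{x} \in U$. Theorem 3.38 says that there is a positive constant $\alpha$ such that
$$\|A\mathbf{x}\| \geq \alpha\|\mathbf{x}\| \quad \text{for all } \mathbf{x} \in \mathbb{R}^n.$$
Therefore, for any $\mathbf{u}_1$ and $\mathbf{u}_2$ in $U$,
$$\begin{aligned} \|\mathbf{F}(\mathbf{u}_1) - \mathbf{F}(\mathbf{u}_2)\| &= \left\|A\left(\mathbf{G}(\mathbf{u}_1 - \mathbf{x}_0) - \mathbf{G}(\mathbf{u}_2 - \mathbf{x}_0)\right)\right\| \\ &\geq \alpha\|\mathbf{G}(\mathbf{u}_1 - \mathbf{x}_0) - \mathbf{G}(\mathbf{u}_2 - \mathbf{x}_0)\| \\ &\geq a\alpha\|\mathbf{u}_1 - \mathbf{u}_2\|. \end{aligned}$$
This shows that $\mathbf{F} : U \to V$ is stable, and thus completes the proof of the lemma.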


Now we prove Theorem 5.4. 




Proof of Theorem 5.4 
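By Lemma 5.5, we only need to consider the case where $\mathbf{x}_0 = \mathbf{0}$, $\mathbf{F}(\mathbf{x}_0) = \mathbf{0}$ and $D\mathbf{F}(\mathbf{x}_0) = I_n$.

Since $\mathbf{F} : \mathcal{O} \to \mathbb{R}^n$ is continuously differentiable, the map $D\mathbf{F} : \mathcal{O} \to \mathcal{M}_n$ is continuous. Since $\det : \mathcal{M}_n \to \mathbb{R}$ is also continuous, and $\det D\mathbf{F}(\mathbf{0}) = 1$, there is an $r_0 > 0$ such that $B(\mathbf{0}, r_0) \subset \mathcal{O}$ and for all $\mathbf{x} \in B(\mathbf{0}, r_0)$, $\det D\mathbf{F}(\mathbf{x}) > \frac{1}{2}$. In particular, $D\mathbf{F}(\mathbf{x})$ is invertible for all $\mathbf{x} \in B(\mathbf{0}, r_0)$.

Let $\mathbf{G} : \mathcal{O} \to \mathbb{R}^n$ be the mapping defined as
$$\mathbf{G}(\mathbf{x}) = \mathbf{F}(\mathbf{x}) - \mathbf{x},$$
so that $\mathbf{F}(\mathbf{x}) = \mathbf{x} + \mathbf{G}(\mathbf{x})$. The mapping $\mathbf{G}$ is continuously differentiable. It satisfies $\mathbf{G}(\mathbf{0}) = \mathbf{0}$ and
$$D\mathbf{G}(\mathbf{0}) = D\mathbf{F}(\mathbf{0}) - I_n = \mathbf{0}.$$
Since $\mathbf{G}$ is continuously differentiable, for any $1 \leq i \leq n$, $1 \leq j \leq n$, there exists $r_{i,j} > 0$ such that $B(\mathbf{0}, r_{i,j}) \subset \mathcal{O}$ and for all $\mathbf{x} \in B(\mathbf{0}, r_{i,j})$,
$$\left|\frac{\partial G_i}{\partial x_j}(\mathbf{x})\right| = \left|\frac{\partial G_i}{\partial x_j}(\mathbf{x}) - \frac{\partial G_i}{\partial x_j}(\mathbf{0})\right| < \frac{1}{2n}.$$
Let
$$r = \min\left(\{r_{i,j} \mid 1 \leq i \leq n,\, 1 \leq j \leq n\} \cup \{r_0\}\right).$$
Then $r > 0$, $B(\mathbf{0}, r) \subset B(\mathbf{0}, r_0)$ and $B(\mathbf{0}, r) \subset B(\mathbf{0}, r_{i,j})$ for all $1 \leq i \leq n$, $1 \leq j \leq n$. The ball $B(\mathbf{0}, r)$ is a convex set. If $\mathbf{u}$ and $\mathbf{v}$ are two points in $B(\mathbf{0}, r)$, the mean value theorem implies that for $1 \leq i \leq n$, there exists $\mathbf{z}_i \in B(\mathbf{0}, r)$ such that
$$G_i(\mathbf{u}) - G_i(\mathbf{v}) = \sum_{j=1}^{n} (u_j - v_j)\frac{\partial G_i}{\partial x_j}(\mathbf{z}_i).$$
Therefore, by the Cauchy-Schwarz inequality,
$$\|\mathbf{G}(\mathbf{u}) - \mathbf{G}(\mathbf{v})\| = \sqrt{\sum_{i=1}^{n}\left(G_i(\mathbf{u}) - G_i(\mathbf{v})\right)^2} \leq \sqrt{\sum_{i=1}^{n}\sum_{j=1}^{n}\left(\frac{\partial G_i}{\partial x_j}(\mathbf{z}_i)\right)^2}\;\|\mathbf{u} - \mathbf{v}\| \leq \sqrt{n^2 \cdot \frac{1}{4n^2}}\;\|\mathbf{u} - \mathbf{v}\| = \frac{1}{2}\|\mathbf{u} - \mathbf{v}\|.$$
This shows that $\mathbf{G} : B(\mathbf{0}, r) \to \mathbb{R}^n$ is a map satisfying $\mathbf{G}(\mathbf{0}) = \mathbf{0}$, and
$$\|\mathbf{G}(\mathbf{u}) - \mathbf{G}(\mathbf{v})\| \leq \frac{1}{2}\|\mathbf{u} - \mathbf{v}\| \quad \text{for all } \mathbf{u}, \mathbf{v} \in B(\mathbf{0}, r).$$
By Theorem 2.44, the map $\mathbf{F} : B(\mathbf{0}, r) \to \mathbb{R}^n$ is one-to-one, and its image contains the open ball $B(\mathbf{0}, r/2)$. Let $V = B(\mathbf{0}, r/2)$. Then $V$ is an open subset of $\mathbb{R}^n$ that is contained in the image of $\mathbf{F}$. Since $\mathbf{F} : B(\mathbf{0}, r) \to \mathbb{R}^n$ is continuous, $U = \left(\mathbf{F}|_{B(\mathbf{0}, r)}\right)^{-1}(V)$ is an open set. By definition, $\mathbf{F} : U \to V$ is a bijection. Since $U$ is contained in $B(\mathbf{0}, r_0)$, $D\mathbf{F}(\mathbf{x})$ is invertible for all $\mathbf{x}$ in $U$. Finally, for any $\mathbf{u}$ and $\mathbf{v}$ in $U$,
$$\|\mathbf{F}(\mathbf{u}) - \mathbf{F}(\mathbf{v})\| \geq \|\mathbf{u} - \mathbf{v}\| - \|\mathbf{G}(\mathbf{u}) - \mathbf{G}(\mathbf{v})\| \geq \frac{1}{2}\|\mathbf{u} - \mathbf{v}\|.$$
This completes the proof of the theorem.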
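
The key step of this proof, writing a map as the identity plus a perturbation with Lipschitz constant at most $\frac{1}{2}$, also suggests a way to invert such a map numerically: the equation $\mathbf{x} + \mathbf{G}(\mathbf{x}) = \mathbf{y}$ is the fixed-point problem $\mathbf{x} = \mathbf{y} - \mathbf{G}(\mathbf{x})$, and the right-hand side is a contraction. The sketch below is an illustration only. The perturbation `G` is a hypothetical example chosen so that $\mathbf{G}(\mathbf{0}) = \mathbf{0}$, $D\mathbf{G}(\mathbf{0}) = \mathbf{0}$ and the Lipschitz constant is at most $\frac{1}{4}$, and connecting this iteration to the argument behind Theorem 2.44 is an assumption, since that theorem is not reproduced here.

```python
import math


def G(x, y):
    """Hypothetical perturbation: G(0) = 0, DG(0) = 0, Lipschitz constant <= 1/4."""
    return (0.25 * (1 - math.cos(y)), 0.25 * (1 - math.cos(x)))


def F(x, y):
    """F = identity + G, so DF(0) is the identity matrix."""
    gx, gy = G(x, y)
    return (x + gx, y + gy)


def solve(target, steps=100):
    """Solve F(x) = target by iterating x <- target - G(x)."""
    x, y = target
    for _ in range(steps):
        gx, gy = G(x, y)
        x, y = target[0] - gx, target[1] - gy
    return x, y


solution = solve((0.3, -0.2))
print(solution, F(*solution))
```

Each iteration shrinks the error by at least the Lipschitz constant of the perturbation, so a modest number of steps already reaches machine precision in this example.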


To complete the proof of the inverse function theorem, it remains to prove that $\mathbf{F}^{-1} : V \to U$ is continuously differentiable, and
$$D\mathbf{F}^{-1}(\mathbf{y}) = D\mathbf{F}(\mathbf{F}^{-1}(\mathbf{y}))^{-1}.$$


Theorem 5.6 


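Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$ that contains the point $\mathbf{x}_0$, and let $\mathbf{F} : \mathcal{O} \to \mathbb{R}^n$ be a continuously differentiable function defined on $\mathcal{O}$. If $\det D\mathbf{F}(\mathbf{x}_0) \neq 0$, then there exists a neighbourhood $U$ of $\mathbf{x}_0$ such that $\mathbf{F}$ maps $U$ bijectively onto the open set $V = \mathbf{F}(U)$, the inverse function $\mathbf{F}^{-1} : V \to U$ is continuously differentiable, and for any $\mathbf{y} \in V$, if $\mathbf{x}$ is the point in $U$ such that $\mathbf{F}(\mathbf{x}) = \mathbf{y}$, then
$$D\mathbf{F}^{-1}(\mathbf{y}) = D\mathbf{F}(\mathbf{x})^{-1}.$$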


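Theorem 5.4 asserts that there exists a neighbourhood $U$ of $\mathbf{x}_0$ such that $\mathbf{F}$ maps $U$ bijectively onto the open set $V = \mathbf{F}(U)$, $D\mathbf{F}(\mathbf{x})$ is invertible for all $\mathbf{x}$ in $U$, and there is a positive constant $c$ such that
$$\|\mathbf{F}(\mathbf{u}_1) - \mathbf{F}(\mathbf{u}_2)\| \geq c\|\mathbf{u}_1 - \mathbf{u}_2\| \quad \text{for all } \mathbf{u}_1, \mathbf{u}_2 \in U. \tag{5.1}$$
Now given $\mathbf{y}$ in $V$, we want to show that $\mathbf{F}^{-1}$ is differentiable at $\mathbf{y}$ and $D\mathbf{F}^{-1}(\mathbf{y}) = D\mathbf{F}(\mathbf{x})^{-1}$, where $\mathbf{x} = \mathbf{F}^{-1}(\mathbf{y})$. Since $V$ is open, there is an $r > 0$ such that $B(\mathbf{y}, r) \subset V$. For $\mathbf{k} \in \mathbb{R}^n$ such that $\|\mathbf{k}\| < r$, let
$$\mathbf{h}(\mathbf{k}) = \mathbf{F}^{-1}(\mathbf{y} + \mathbf{k}) - \mathbf{F}^{-1}(\mathbf{y}).$$
Then
$$\mathbf{F}(\mathbf{x}) = \mathbf{y} \quad \text{and} \quad \mathbf{F}(\mathbf{x} + \mathbf{h}) = \mathbf{y} + \mathbf{k}.$$
Eq. (5.1) implies that
$$\|\mathbf{h}\| \leq \frac{1}{c}\|\mathbf{k}\|. \tag{5.2}$$
Let $A = D\mathbf{F}(\mathbf{x})$. By assumption, $A$ is invertible. Notice that
$$\begin{aligned} \mathbf{F}^{-1}(\mathbf{y} + \mathbf{k}) - \mathbf{F}^{-1}(\mathbf{y}) - A^{-1}\mathbf{k} &= \mathbf{h} - A^{-1}\mathbf{k} = -A^{-1}(\mathbf{k} - A\mathbf{h}) \\ &= -A^{-1}\left(\mathbf{F}(\mathbf{x} + \mathbf{h}) - \mathbf{F}(\mathbf{x}) - A\mathbf{h}\right). \end{aligned}$$
There is a positive constant $\beta$ such that
$$\|A^{-1}\mathbf{y}\| \leq \beta\|\mathbf{y}\| \quad \text{for all } \mathbf{y} \in \mathbb{R}^n.$$
Therefore, for $\mathbf{k} \neq \mathbf{0}$ (so that $\mathbf{h} \neq \mathbf{0}$, since $\mathbf{F}^{-1}$ is one-to-one),
$$\begin{aligned} \frac{\left\|\mathbf{F}^{-1}(\mathbf{y} + \mathbf{k}) - \mathbf{F}^{-1}(\mathbf{y}) - A^{-1}\mathbf{k}\right\|}{\|\mathbf{k}\|} &\leq \frac{\beta}{\|\mathbf{k}\|}\left\|\mathbf{F}(\mathbf{x} + \mathbf{h}) - \mathbf{F}(\mathbf{x}) - A\mathbf{h}\right\| \\ &\leq \frac{\beta}{c}\,\frac{\left\|\mathbf{F}(\mathbf{x} + \mathbf{h}) - \mathbf{F}(\mathbf{x}) - A\mathbf{h}\right\|}{\|\mathbf{h}\|}. \end{aligned} \tag{5.3}$$
Since $\mathbf{F}$ is differentiable at $\mathbf{x}$,
$$\lim_{\mathbf{h} \to \mathbf{0}} \frac{\mathbf{F}(\mathbf{x} + \mathbf{h}) - \mathbf{F}(\mathbf{x}) - A\mathbf{h}}{\|\mathbf{h}\|} = \mathbf{0}.$$
Eq. (5.2) implies that $\displaystyle\lim_{\mathbf{k} \to \mathbf{0}} \mathbf{h} = \mathbf{0}$. Eq. (5.3) then implies that
$$\lim_{\mathbf{k} \to \mathbf{0}} \frac{\mathbf{F}^{-1}(\mathbf{y} + \mathbf{k}) - \mathbf{F}^{-1}(\mathbf{y}) - A^{-1}\mathbf{k}}{\|\mathbf{k}\|} = \mathbf{0}.$$
This proves that $\mathbf{F}^{-1}$ is differentiable at $\mathbf{y}$ and
$$D\mathbf{F}^{-1}(\mathbf{y}) = A^{-1} = D\mathbf{F}(\mathbf{x})^{-1}.$$
Now the map $D\mathbf{F}^{-1} : V \to GL(n, \mathbb{R})$ is the composition of the maps $\mathbf{F}^{-1} : V \to U$, $D\mathbf{F} : U \to GL(n, \mathbb{R})$ and $\mathscr{I} : GL(n, \mathbb{R}) \to GL(n, \mathbb{R})$ which takes $A$ to $A^{-1}$. Since each of these maps is continuous, the map $D\mathbf{F}^{-1} : V \to GL(n, \mathbb{R})$ is continuous. This completes the proof that $\mathbf{F}^{-1} : V \to U$ is continuously differentiable.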


At the end of this section, let us give a brief discussion about the concept of 


homeomorphism and diffeomorphism. 


Definition 5.2 Homeomorphism 


Let $A$ be a subset of $\mathbb{R}^m$ and let $B$ be a subset of $\mathbb{R}^n$. We say that $A$ and $B$ are homeomorphic if there exists a continuous bijective function $\mathbf{F} : A \to B$ whose inverse $\mathbf{F}^{-1} : B \to A$ is also continuous. Such a function $\mathbf{F}$ is called a homeomorphism between $A$ and $B$.


Definition 5.3 Diffeomorphism 


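Let $\mathcal{O}$ and $\mathcal{U}$ be open subsets of $\mathbb{R}^n$. We say that $\mathcal{U}$ and $\mathcal{O}$ are diffeomorphic if there exists a homeomorphism $\mathbf{F} : \mathcal{O} \to \mathcal{U}$ between $\mathcal{O}$ and $\mathcal{U}$ such that $\mathbf{F}$ and $\mathbf{F}^{-1}$ are differentiable.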


Example 5.7 


Let $A = \{(x,y) \mid x^2 + y^2 < 1\}$ and $B = \{(x,y) \mid 4x^2 + 9y^2 < 36\}$. Define the map $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2$ by
$$\mathbf{F}(x,y) = (3x, 2y).$$
Then $\mathbf{F}$ is an invertible linear transformation with
$$\mathbf{F}^{-1}(x,y) = \left(\frac{x}{3}, \frac{y}{2}\right).$$
The mappings $\mathbf{F}$ and $\mathbf{F}^{-1}$ are continuously differentiable. It is easy to show that $\mathbf{F}$ maps $A$ bijectively onto $B$. Hence, $\mathbf{F} : A \to B$ is a diffeomorphism between $A$ and $B$.


Figure 5.3: $A = \{(x,y) \mid x^2 + y^2 < 1\}$ and $B = \{(x,y) \mid 4x^2 + 9y^2 < 36\}$ are diffeomorphic.


Theorem 5.3 gives the following. 


Theorem 5.7 


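Let $A$ be an invertible $n \times n$ matrix, and let $\mathbf{x}_0$ and $\mathbf{y}_0$ be two points in $\mathbb{R}^n$. Define the mapping $\mathbf{F} : \mathbb{R}^n \to \mathbb{R}^n$ by
$$\mathbf{F}(\mathbf{x}) = \mathbf{y}_0 + A(\mathbf{x} - \mathbf{x}_0).$$
If $\mathcal{O}$ is an open subset of $\mathbb{R}^n$, then $\mathbf{F} : \mathcal{O} \to \mathbf{F}(\mathcal{O})$ is a diffeomorphism.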


The inverse function theorem gives the following. 




Theorem 5.8 


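Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$, and let $\mathbf{F} : \mathcal{O} \to \mathbb{R}^n$ be a continuously differentiable mapping such that $D\mathbf{F}(\mathbf{x})$ is invertible for all $\mathbf{x} \in \mathcal{O}$. If $\mathcal{U}$ is an open subset contained in $\mathcal{O}$ such that $\mathbf{F} : \mathcal{U} \to \mathbb{R}^n$ is one-to-one, then $\mathbf{F} : \mathcal{U} \to \mathbf{F}(\mathcal{U})$ is a diffeomorphism.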


The proof of this theorem is left as an exercise. 




Exercises 5.2 


Question 1 


R? — R? be the mapping given by 


F(z, y) = (xe? + xy, 2x? + 3y’). 


Show that there is a neighbourhood U of (—1,0) such that the mapping 
F : U > R’ is stable. 


Question 2 


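Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$, and let $\mathbf{F} : \mathcal{O} \to \mathbb{R}^n$ be a continuously differentiable mapping such that $\det D\mathbf{F}(\mathbf{x}) \neq 0$ for all $\mathbf{x} \in \mathcal{O}$. Show that $\mathbf{F}(\mathcal{O})$ is an open set.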


Question 3 


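Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$, and let $\mathbf{F} : \mathcal{O} \to \mathbb{R}^n$ be a continuously differentiable mapping such that $D\mathbf{F}(\mathbf{x})$ is invertible for all $\mathbf{x} \in \mathcal{O}$. If $\mathcal{U}$ is an open subset contained in $\mathcal{O}$ such that $\mathbf{F} : \mathcal{U} \to \mathbb{R}^n$ is one-to-one, then $\mathbf{F} : \mathcal{U} \to \mathbf{F}(\mathcal{U})$ is a diffeomorphism.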


Question 4 


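Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$, and let $\mathbf{F} : \mathcal{O} \to \mathbb{R}^n$ be a differentiable mapping. Assume that there is a positive constant $c$ such that
$$\|\mathbf{F}(\mathbf{u}) - \mathbf{F}(\mathbf{v})\| \geq c\|\mathbf{u} - \mathbf{v}\| \quad \text{for all } \mathbf{u}, \mathbf{v} \in \mathcal{O}.$$
Use the first order approximation theorem to show that for any $\mathbf{x} \in \mathcal{O}$ and any $\mathbf{h} \in \mathbb{R}^n$,
$$\|D\mathbf{F}(\mathbf{x})\mathbf{h}\| \geq c\|\mathbf{h}\|.$$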




Question 5 


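Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$, and let $\mathbf{F} : \mathcal{O} \to \mathbb{R}^n$ be a continuously differentiable mapping.

(a) If $\mathbf{F} : \mathcal{O} \to \mathbb{R}^n$ is stable, show that the derivative matrix $D\mathbf{F}(\mathbf{x})$ is invertible at every $\mathbf{x}$ in $\mathcal{O}$.

(b) Assume that the derivative matrix $D\mathbf{F}(\mathbf{x})$ is invertible at every $\mathbf{x}$ in $\mathcal{O}$. If $C$ is a compact subset of $\mathcal{O}$, show that the mapping $\mathbf{F} : C \to \mathbb{R}^n$ is stable.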




5.3. The Implicit Function Theorem 


The implicit function theorem is about the possibility of solving $m$ variables from a system of $m$ equations with $n + m$ variables. Let us study some special cases.

Consider the function $f : \mathbb{R}^2 \to \mathbb{R}$ given by $f(x,y) = x^2 + y^2 - 1$. For a point $(x_0, y_0)$ that satisfies $f(x_0, y_0) = 0$, we want to ask whether there is a neighbourhood $I$ of $x_0$, a neighbourhood $J$ of $y_0$, and a function $g : I \to \mathbb{R}$ such that for $(x,y) \in I \times J$, $f(x,y) = 0$ if and only if $y = g(x)$.


Figure 5.4: The points in the $(x,y)$ plane satisfying $x^2 + y^2 - 1 = 0$.


If $(x_0, y_0)$ is a point with $y_0 > 0$ and $f(x_0, y_0) = 0$, then we can take the neighbourhoods $I = (-1, 1)$ and $J = (0, \infty)$ of $x_0$ and $y_0$ respectively, and define the function $g : I \to \mathbb{R}$ by
$$g(x) = \sqrt{1 - x^2}.$$
We then find that for $(x,y) \in I \times J$, $f(x,y) = 0$ if and only if $y = \sqrt{1 - x^2} = g(x)$.

If $(x_0, y_0)$ is a point with $y_0 < 0$ and $f(x_0, y_0) = 0$, then we can take the neighbourhoods $I = (-1, 1)$ and $J = (-\infty, 0)$ of $x_0$ and $y_0$ respectively, and define the function $g : I \to \mathbb{R}$ by
$$g(x) = -\sqrt{1 - x^2}.$$
We then find that for $(x,y) \in I \times J$, $f(x,y) = 0$ if and only if $y = -\sqrt{1 - x^2} = g(x)$.

However, if $(x_0, y_0) = (1, 0)$, any neighbourhood $J$ of $y_0$ must contain an interval of the form $(-r, r)$. If $I$ is a neighbourhood of $1$ and $(x,y)$ is a point in $I \times (-r, r)$ such that $f(x,y) = 0$, then $(x, -y)$ is another point in $I \times (-r, r)$ satisfying $f(x, -y) = 0$. This shows that there does not exist any function $g : I \to \mathbb{R}$ such that when $(x,y) \in I \times J$, $f(x,y) = 0$ if and only if $y = g(x)$. We say that we cannot solve $y$ as a function of $x$ in a neighbourhood of the point $(1, 0)$.

Similarly, we cannot solve $y$ as a function of $x$ in a neighbourhood of the point $(-1, 0)$.

However, in a neighbourhood of the points $(1, 0)$ and $(-1, 0)$, we can solve $x$ as a function of $y$.


For a function $f : \mathcal{O} \to \mathbb{R}$ defined on an open subset $\mathcal{O}$ of $\mathbb{R}^2$, the implicit function theorem takes the following form.

Theorem 5.9 Dini's Theorem

Let $\mathcal{O}$ be an open subset of $\mathbb{R}^2$ that contains the point $(x_0, y_0)$, and let $f : \mathcal{O} \to \mathbb{R}$ be a continuously differentiable function defined on $\mathcal{O}$ such that $f(x_0, y_0) = 0$. If $\dfrac{\partial f}{\partial y}(x_0, y_0) \neq 0$, then there is a neighbourhood $I$ of $x_0$, a neighbourhood $J$ of $y_0$, and a continuously differentiable function $g : I \to J$ such that for any $(x, y) \in I \times J$, $f(x, y) = 0$ if and only if $y = g(x)$. Moreover, for any $x \in I$,
\[ \frac{\partial f}{\partial x}(x, g(x)) + \frac{\partial f}{\partial y}(x, g(x))\, g'(x) = 0. \]

Dini's theorem says that to be able to solve $y$ as a function of $x$, a sufficient condition is that the function $f$ has continuous partial derivatives, and $f_y$ does not vanish. By interchanging the roles of $x$ and $y$, we see that if $f_x$ does not vanish, we can solve $x$ as a function of $y$.


For the function $f : \mathbb{R}^2 \to \mathbb{R}$, $f(x, y) = x^2 + y^2 - 1$, the points on the set $x^2 + y^2 = 1$ at which $f_y(x, y) = 2y$ vanishes are the points $(1, 0)$ and $(-1, 0)$. In fact, we have seen that we cannot solve $y$ as a function of $x$ in neighbourhoods of these two points.
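As a small numerical illustration of Dini's theorem for the circle, the following sketch compares the slope predicted by the relation $f_x + f_y\, g' = 0$ with a finite-difference slope of the explicit upper branch $g(x) = \sqrt{1 - x^2}$. The helper names (`dini_slope`, `numeric_slope`) and the sample point $x = 0.6$ are choices made for this illustration only.

```python
import math

def f_x(x, y):
    # Partial derivative of f(x, y) = x^2 + y^2 - 1 with respect to x.
    return 2 * x

def f_y(x, y):
    # Partial derivative of f with respect to y.
    return 2 * y

def g(x):
    # Upper branch of the circle, valid for -1 < x < 1.
    return math.sqrt(1 - x**2)

def dini_slope(x):
    # Slope from f_x + f_y * g' = 0, valid where f_y(x, g(x)) != 0.
    y = g(x)
    return -f_x(x, y) / f_y(x, y)

def numeric_slope(x, h=1e-6):
    # Central difference approximation of g'(x).
    return (g(x + h) - g(x - h)) / (2 * h)

x = 0.6
print(dini_slope(x), numeric_slope(x))
```

The two printed numbers should agree to several decimal places. Near $x = \pm 1$ the denominator $f_y$ tends to zero, which mirrors the failure of the theorem's hypothesis at $(\pm 1, 0)$.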


Proof of Dini's Theorem
Without loss of generality, assume that $f_y(x_0, y_0) > 0$. Let $\mathbf{u}_0 = (x_0, y_0)$. Since $f_y : \mathcal{O} \to \mathbb{R}$ is continuous, there is an $r_1 > 0$ such that the closed rectangle $R = [x_0 - r_1, x_0 + r_1] \times [y_0 - r_1, y_0 + r_1]$ lies in $\mathcal{O}$, and for all $(x, y) \in R$,
\[ f_y(x, y) > \frac{f_y(\mathbf{u}_0)}{2} > 0. \]
For any $x \in [x_0 - r_1, x_0 + r_1]$, the function $h_x : [y_0 - r_1, y_0 + r_1] \to \mathbb{R}$, $h_x(y) = f(x, y)$, has derivative $h_x'(y) = f_y(x, y)$ that is positive. Hence, $h_x(y) = f(x, y)$ is strictly increasing in $y$. This implies that
\[ f(x, y_0 - r_1) < f(x, y_0) < f(x, y_0 + r_1). \]
When $x = x_0$, we find that
\[ f(x_0, y_0 - r_1) < 0 < f(x_0, y_0 + r_1). \]
Since $f$ is continuously differentiable, it is continuous. Hence, there is an $r_2 > 0$ such that $r_2 \le r_1$, and for all $x \in [x_0 - r_2, x_0 + r_2]$,
\[ f(x, y_0 - r_1) < 0 \quad \text{and} \quad f(x, y_0 + r_1) > 0. \]
Let $I = (x_0 - r_2, x_0 + r_2)$. For $x \in I$, since $h_x : [y_0 - r_1, y_0 + r_1] \to \mathbb{R}$ is continuous, and
\[ h_x(y_0 - r_1) < 0 < h_x(y_0 + r_1), \]
the intermediate value theorem implies that there is a $y \in (y_0 - r_1, y_0 + r_1)$ such that $h_x(y) = 0$. Since $h_x$ is strictly increasing, this $y$ is unique, and we denote it by $g(x)$. This defines the function $g : I \to \mathbb{R}$. Let $J = (y_0 - r_1, y_0 + r_1)$. By our argument, for each $x \in I$, $y = g(x)$ is the unique $y \in J$ such that $f(x, y) = 0$. Thus, for any $(x, y) \in I \times J$, $f(x, y) = 0$ if and only if $y = g(x)$.

It remains to prove that $g : I \to \mathbb{R}$ is continuously differentiable. By our convention above, there is a positive constant $c$ such that
\[ f_y(x, y) \ge c \]
for all $(x, y) \in I \times J$.


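Fix $x \in I$. There exists an $r > 0$ such that $(x - r, x + r) \subset I$. For $h$ satisfying $0 < |h| < r$, $x + h$ is in $I$. By the mean value theorem, there is a $c_h \in (0, 1)$ such that
\[ f(x + h, g(x + h)) - f(x, g(x)) = h\,\frac{\partial f}{\partial x}(\mathbf{u}_h) + \bigl(g(x + h) - g(x)\bigr)\frac{\partial f}{\partial y}(\mathbf{u}_h), \]
where
\[ \mathbf{u}_h = (x, g(x)) + c_h\bigl(h, g(x + h) - g(x)\bigr). \tag{5.4} \]
Since
\[ f(x + h, g(x + h)) = 0 = f(x, g(x)), \]
we find that
\[ \frac{g(x + h) - g(x)}{h} = -\frac{f_x(\mathbf{u}_h)}{f_y(\mathbf{u}_h)}. \tag{5.5} \]
Since $f_x$ is continuous on the compact set $R$, it is bounded. Namely, there exists a constant $M$ such that
\[ |f_x(x, y)| \le M \quad \text{for all } (x, y) \in R. \]
Eq. (5.5) then implies that
\[ |g(x + h) - g(x)| \le \frac{M}{c}\,|h|. \]
Taking $h \to 0$ proves that $g$ is continuous at $x$. From (5.4), we find that
\[ \lim_{h \to 0} \mathbf{u}_h = (x, g(x)). \]
Since $f_x$ and $f_y$ are continuous at $(x, g(x))$, eq. (5.5) gives
\[ \lim_{h \to 0} \frac{g(x + h) - g(x)}{h} = -\frac{f_x(x, g(x))}{f_y(x, g(x))}. \]
This proves that $g$ is differentiable at $x$ and
\[ \frac{\partial f}{\partial x}(x, g(x)) + \frac{\partial f}{\partial y}(x, g(x))\, g'(x) = 0. \]
Since $f_x$, $f_y$ and $g$ are continuous and $f_y$ does not vanish on $I \times J$, this formula also shows that $g'$ is continuous.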


Figure 5.5: Proof of Dini's Theorem.


Example 5.8
Consider the equation
\[ xy^3 + \sin(x + y) + 4x^2 y = 3. \]
Show that in a neighbourhood of $(-1, 1)$, this equation defines $y$ as a function of $x$. If this function is denoted as $y = g(x)$, find $g'(-1)$.

Solution
Let $f : \mathbb{R}^2 \to \mathbb{R}$ be the function defined as
\[ f(x, y) = xy^3 + \sin(x + y) + 4x^2 y - 3. \]
Since the sine function and polynomial functions are infinitely differentiable, $f$ is infinitely differentiable. Moreover, $f(-1, 1) = -1 + 0 + 4 - 3 = 0$, and
\[ \frac{\partial f}{\partial y}(x, y) = 3xy^2 + \cos(x + y) + 4x^2, \qquad \frac{\partial f}{\partial y}(-1, 1) = 2 \neq 0. \]
By Dini's theorem, there is a neighbourhood of $(-1, 1)$ such that $y$ can be solved as a function of $x$. Now,
\[ \frac{\partial f}{\partial x}(x, y) = y^3 + \cos(x + y) + 8xy, \qquad \frac{\partial f}{\partial x}(-1, 1) = -6. \]
Hence,
\[ g'(-1) = -\frac{f_x(-1, 1)}{f_y(-1, 1)} = -\frac{-6}{2} = 3. \]
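The value of the derivative in Example 5.8 can be cross-checked numerically. The sketch below solves $f(x, y) = 0$ for $y$ near $1$ by Newton's method (a technique brought in for this illustration, not the method of the text) and compares a finite-difference slope with $-f_x/f_y$ at $(-1, 1)$. The function names, step size and iteration count are assumptions chosen for the example.

```python
import math

def f(x, y):
    return x * y**3 + math.sin(x + y) + 4 * x**2 * y - 3

def f_x(x, y):
    return y**3 + math.cos(x + y) + 8 * x * y

def f_y(x, y):
    return 3 * x * y**2 + math.cos(x + y) + 4 * x**2

def solve_y(x, y_start=1.0, steps=50):
    # Newton's method in y for fixed x, started at y = 1.
    # Only expected to work for x close to -1, where f_y is near 2.
    y = y_start
    for _ in range(steps):
        y -= f(x, y) / f_y(x, y)
    return y

# Slope predicted by the implicit differentiation formula.
slope_formula = -f_x(-1.0, 1.0) / f_y(-1.0, 1.0)

# Slope of the numerically computed implicit function y = g(x) at x = -1.
h = 1e-5
slope_numeric = (solve_y(-1.0 + h) - solve_y(-1.0 - h)) / (2 * h)

print(slope_formula, slope_numeric)
```

Both numbers should be close to each other; the finite-difference value is only an approximation and depends on the step size `h`.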




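Now we turn to the general case. First we consider polynomial mappings of degree at most one. Let $A = [a_{ij}]$ be an $m \times n$ matrix, and let $B = [b_{ij}]$ be an $m \times m$ matrix. Given $\mathbf{x} \in \mathbb{R}^n$, $\mathbf{y} \in \mathbb{R}^m$, $\mathbf{c} \in \mathbb{R}^m$, the system of equations
\[ A\mathbf{x} + B\mathbf{y} = \mathbf{c} \]
is the following $m$ equations in $m + n$ variables $x_1, \ldots, x_n, y_1, \ldots, y_m$:
\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n + b_{11}y_1 + b_{12}y_2 + \cdots + b_{1m}y_m &= c_1, \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n + b_{21}y_1 + b_{22}y_2 + \cdots + b_{2m}y_m &= c_2, \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n + b_{m1}y_1 + b_{m2}y_2 + \cdots + b_{mm}y_m &= c_m.
\end{aligned}
\]
Let us look at an example.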
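Example 5.9

Consider the linear system
\[
\begin{aligned}
2x_1 + 3x_2 - 5x_3 + 2y_1 - y_2 &= 1, \\
3x_1 - x_2 + 2x_3 - 3y_1 + y_2 &= 0.
\end{aligned}
\]
Show that $\mathbf{y} = (y_1, y_2)$ can be solved as a function of $\mathbf{x} = (x_1, x_2, x_3)$. Write down the function $\mathbf{G} : \mathbb{R}^3 \to \mathbb{R}^2$ such that the solution is given by $\mathbf{y} = \mathbf{G}(\mathbf{x})$, and find $D\mathbf{G}(\mathbf{x})$.

Solution

Let
\[ A = \begin{bmatrix} 2 & 3 & -5 \\ 3 & -1 & 2 \end{bmatrix}, \qquad B = \begin{bmatrix} 2 & -1 \\ -3 & 1 \end{bmatrix}. \]
Then the system can be written as
\[ A\mathbf{x} + B\mathbf{y} = \mathbf{c}, \quad \text{where } \mathbf{c} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \]
This implies that
\[ B\mathbf{y} = \mathbf{c} - A\mathbf{x}. \tag{5.6} \]
For every $\mathbf{x} \in \mathbb{R}^3$, $\mathbf{c} - A\mathbf{x}$ is a vector in $\mathbb{R}^2$. Since $\det B = -1 \neq 0$, $B$ is invertible. Therefore, there is a unique $\mathbf{y}$ satisfying (5.6). It is given by
\[
\begin{aligned}
\mathbf{G}(\mathbf{x}) = \mathbf{y} &= B^{-1}(\mathbf{c} - A\mathbf{x}) \\
&= -\begin{bmatrix} 1 \\ 3 \end{bmatrix} + \begin{bmatrix} 5 & 2 & -3 \\ 12 & 7 & -11 \end{bmatrix}\mathbf{x} \\
&= \begin{bmatrix} 5x_1 + 2x_2 - 3x_3 - 1 \\ 12x_1 + 7x_2 - 11x_3 - 3 \end{bmatrix}.
\end{aligned}
\]
It follows that
\[ D\mathbf{G} = \begin{bmatrix} 5 & 2 & -3 \\ 12 & 7 & -11 \end{bmatrix}. \]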
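The computation in Example 5.9 can be reproduced with exact rational arithmetic. The following sketch uses only the Python standard library; the helpers `inverse_2x2` and `mat_mul` are written for this illustration and are not part of any library.

```python
from fractions import Fraction

# Coefficient matrices of the system A x + B y = c from Example 5.9.
A = [[2, 3, -5], [3, -1, 2]]
B = [[2, -1], [-3, 1]]
c = [1, 0]

def inverse_2x2(M):
    # Inverse of a 2x2 matrix via the adjugate formula.
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(M[1][1], det), Fraction(-M[0][1], det)],
            [Fraction(-M[1][0], det), Fraction(M[0][0], det)]]

def mat_mul(P, Q):
    # Product of two matrices given as lists of rows.
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

B_inv = inverse_2x2(B)

# DG = -B^{-1} A, which is constant for a linear system.
DG = [[-entry for entry in row] for row in mat_mul(B_inv, A)]

def G(x):
    # y = B^{-1} (c - A x)
    rhs = [c[i] - sum(A[i][j] * x[j] for j in range(3)) for i in range(2)]
    return [sum(B_inv[i][k] * rhs[k] for k in range(2)) for i in range(2)]

print(DG)
print(G([0, 0, 0]))
```

Substituting `G(x)` back into the two equations for any sample `x` is a quick way to confirm that the solved `y` satisfies the system.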
12 7 —-11 


The following theorem gives a general scenario. 


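Theorem 5.10

Let $A = [a_{ij}]$ be an $m \times n$ matrix, and let $B = [b_{ij}]$ be an $m \times m$ matrix. Define the function $\mathbf{F} : \mathbb{R}^{n+m} \to \mathbb{R}^m$ by
\[ \mathbf{F}(\mathbf{x}, \mathbf{y}) = A\mathbf{x} + B\mathbf{y} - \mathbf{c}, \]
where $\mathbf{c}$ is a constant vector in $\mathbb{R}^m$. The equation $\mathbf{F}(\mathbf{x}, \mathbf{y}) = \mathbf{0}$ defines the variable $\mathbf{y} = (y_1, \ldots, y_m)$ as a function of $\mathbf{x} = (x_1, \ldots, x_n)$ if and only if the matrix $B$ is invertible. If we denote this function as $\mathbf{G} : \mathbb{R}^n \to \mathbb{R}^m$, then
\[ \mathbf{G}(\mathbf{x}) = B^{-1}(\mathbf{c} - A\mathbf{x}), \]
and
\[ D\mathbf{G}(\mathbf{x}) = -B^{-1}A. \]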


Proof
The equation $\mathbf{F}(\mathbf{x}, \mathbf{y}) = \mathbf{0}$ defines the variables $\mathbf{y}$ as a function of $\mathbf{x}$ if and only if for each $\mathbf{x} \in \mathbb{R}^n$, there is a unique $\mathbf{y} \in \mathbb{R}^m$ satisfying
\[ B\mathbf{y} = \mathbf{c} - A\mathbf{x}. \]
This is a linear system for the variable $\mathbf{y}$. By the theory of linear algebra, a unique solution $\mathbf{y}$ exists if and only if $B$ is invertible. In this case, the solution is given by
\[ \mathbf{y} = B^{-1}(\mathbf{c} - A\mathbf{x}). \]
The rest of the assertion follows.


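Write a point in $\mathbb{R}^{n+m}$ as $(\mathbf{x}, \mathbf{y})$, where $\mathbf{x} \in \mathbb{R}^n$ and $\mathbf{y} \in \mathbb{R}^m$. If $\mathbf{F} : \mathbb{R}^{n+m} \to \mathbb{R}^m$ is a function that is differentiable at the point $(\mathbf{x}, \mathbf{y})$, the $m \times (m + n)$ derivative matrix $D\mathbf{F}(\mathbf{x}, \mathbf{y})$ can be written as
\[ D\mathbf{F}(\mathbf{x}, \mathbf{y}) = \bigl[\, D_{\mathbf{x}}\mathbf{F}(\mathbf{x}, \mathbf{y}) \;\big|\; D_{\mathbf{y}}\mathbf{F}(\mathbf{x}, \mathbf{y}) \,\bigr], \]
where
\[
D_{\mathbf{x}}\mathbf{F}(\mathbf{x}, \mathbf{y}) =
\begin{bmatrix}
\dfrac{\partial F_1}{\partial x_1}(\mathbf{x}, \mathbf{y}) & \dfrac{\partial F_1}{\partial x_2}(\mathbf{x}, \mathbf{y}) & \cdots & \dfrac{\partial F_1}{\partial x_n}(\mathbf{x}, \mathbf{y}) \\
\dfrac{\partial F_2}{\partial x_1}(\mathbf{x}, \mathbf{y}) & \dfrac{\partial F_2}{\partial x_2}(\mathbf{x}, \mathbf{y}) & \cdots & \dfrac{\partial F_2}{\partial x_n}(\mathbf{x}, \mathbf{y}) \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial F_m}{\partial x_1}(\mathbf{x}, \mathbf{y}) & \dfrac{\partial F_m}{\partial x_2}(\mathbf{x}, \mathbf{y}) & \cdots & \dfrac{\partial F_m}{\partial x_n}(\mathbf{x}, \mathbf{y})
\end{bmatrix},
\]
\[
D_{\mathbf{y}}\mathbf{F}(\mathbf{x}, \mathbf{y}) =
\begin{bmatrix}
\dfrac{\partial F_1}{\partial y_1}(\mathbf{x}, \mathbf{y}) & \dfrac{\partial F_1}{\partial y_2}(\mathbf{x}, \mathbf{y}) & \cdots & \dfrac{\partial F_1}{\partial y_m}(\mathbf{x}, \mathbf{y}) \\
\dfrac{\partial F_2}{\partial y_1}(\mathbf{x}, \mathbf{y}) & \dfrac{\partial F_2}{\partial y_2}(\mathbf{x}, \mathbf{y}) & \cdots & \dfrac{\partial F_2}{\partial y_m}(\mathbf{x}, \mathbf{y}) \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial F_m}{\partial y_1}(\mathbf{x}, \mathbf{y}) & \dfrac{\partial F_m}{\partial y_2}(\mathbf{x}, \mathbf{y}) & \cdots & \dfrac{\partial F_m}{\partial y_m}(\mathbf{x}, \mathbf{y})
\end{bmatrix}.
\]
Notice that $D_{\mathbf{y}}\mathbf{F}(\mathbf{x}, \mathbf{y})$ is a square matrix.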


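When $A = [a_{ij}]$ is an $m \times n$ matrix, $B = [b_{ij}]$ is an $m \times m$ matrix, $\mathbf{c}$ is a vector in $\mathbb{R}^m$, and $\mathbf{F} : \mathbb{R}^{n+m} \to \mathbb{R}^m$ is the function defined as
\[ \mathbf{F}(\mathbf{x}, \mathbf{y}) = A\mathbf{x} + B\mathbf{y} - \mathbf{c}, \]
it is easy to compute that
\[ D_{\mathbf{x}}\mathbf{F}(\mathbf{x}, \mathbf{y}) = A, \qquad D_{\mathbf{y}}\mathbf{F}(\mathbf{x}, \mathbf{y}) = B. \]
Theorem 5.10 says that we can solve $\mathbf{y}$ as a function of $\mathbf{x}$ from the system of $m$ equations
\[ \mathbf{F}(\mathbf{x}, \mathbf{y}) = \mathbf{0} \]
if and only if
\[ B = D_{\mathbf{y}}\mathbf{F}(\mathbf{x}, \mathbf{y}) \]
is invertible. In this case, if $\mathbf{G} : \mathbb{R}^n \to \mathbb{R}^m$ is the function so that $\mathbf{y} = \mathbf{G}(\mathbf{x})$ is the solution, then
\[ D\mathbf{G}(\mathbf{x}) = -B^{-1}A = -D_{\mathbf{y}}\mathbf{F}(\mathbf{x}, \mathbf{y})^{-1} D_{\mathbf{x}}\mathbf{F}(\mathbf{x}, \mathbf{y}). \]
In fact, this latter formula follows from $\mathbf{F}(\mathbf{x}, \mathbf{G}(\mathbf{x})) = \mathbf{0}$ and the chain rule.

The special case of degree one polynomial mappings gives us sufficient insight into the general implicit function theorem. However, for nonlinear mappings, the conclusions can only be made locally.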
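The chain rule step behind the formula for $D\mathbf{G}$ can be written out as follows. This is a sketch under the assumptions that $\mathbf{G}$ is differentiable and that $D_{\mathbf{y}}\mathbf{F}$ is invertible at the points $(\mathbf{x}, \mathbf{G}(\mathbf{x}))$.

```latex
\begin{align*}
\mathbf{0}
  &= D\bigl[\mathbf{F}(\mathbf{x}, \mathbf{G}(\mathbf{x}))\bigr]
   = D_{\mathbf{x}}\mathbf{F}(\mathbf{x}, \mathbf{G}(\mathbf{x}))
   + D_{\mathbf{y}}\mathbf{F}(\mathbf{x}, \mathbf{G}(\mathbf{x}))\, D\mathbf{G}(\mathbf{x}), \\
D\mathbf{G}(\mathbf{x})
  &= -\,D_{\mathbf{y}}\mathbf{F}(\mathbf{x}, \mathbf{G}(\mathbf{x}))^{-1}\,
      D_{\mathbf{x}}\mathbf{F}(\mathbf{x}, \mathbf{G}(\mathbf{x})).
\end{align*}
```

For $\mathbf{F}(\mathbf{x}, \mathbf{y}) = A\mathbf{x} + B\mathbf{y} - \mathbf{c}$ the first line reads $A + B\, D\mathbf{G}(\mathbf{x}) = 0$, which recovers $D\mathbf{G}(\mathbf{x}) = -B^{-1}A$.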


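Theorem 5.11 Implicit Function Theorem

Let $\mathcal{O}$ be an open subset of $\mathbb{R}^{n+m}$, and let $\mathbf{F} : \mathcal{O} \to \mathbb{R}^m$ be a continuously differentiable function defined on $\mathcal{O}$. Assume that $\mathbf{x}_0$ is a point in $\mathbb{R}^n$ and $\mathbf{y}_0$ is a point in $\mathbb{R}^m$ such that the point $(\mathbf{x}_0, \mathbf{y}_0)$ is in $\mathcal{O}$ and $\mathbf{F}(\mathbf{x}_0, \mathbf{y}_0) = \mathbf{0}$. If $\det D_{\mathbf{y}}\mathbf{F}(\mathbf{x}_0, \mathbf{y}_0) \neq 0$, then we have the following.

(i) There is a neighbourhood $U$ of $\mathbf{x}_0$, a neighbourhood $V$ of $\mathbf{y}_0$, and a continuously differentiable function $\mathbf{G} : U \to \mathbb{R}^m$ such that for any $(\mathbf{x}, \mathbf{y}) \in U \times V$, $\mathbf{F}(\mathbf{x}, \mathbf{y}) = \mathbf{0}$ if and only if $\mathbf{y} = \mathbf{G}(\mathbf{x})$.

(ii) For any $\mathbf{x} \in U$,
\[ D_{\mathbf{x}}\mathbf{F}(\mathbf{x}, \mathbf{G}(\mathbf{x})) + D_{\mathbf{y}}\mathbf{F}(\mathbf{x}, \mathbf{G}(\mathbf{x}))\, D\mathbf{G}(\mathbf{x}) = \mathbf{0}. \]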


Here we will give a proof of the implicit function theorem using the inverse function theorem. The idea of the proof is to construct a mapping to which one can apply the inverse function theorem. Let us look at an example first.


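Example 5.10

Let $\mathbf{F} : \mathbb{R}^5 \to \mathbb{R}^2$ be the function defined as
\[ \mathbf{F}(x_1, x_2, x_3, y_1, y_2) = (x_1 y_2^2,\; x_2 x_3 y_1^2 + x_1 y_2). \]
Define the mapping $\mathbf{H} : \mathbb{R}^5 \to \mathbb{R}^5$ as
\[ \mathbf{H}(\mathbf{x}, \mathbf{y}) = (\mathbf{x}, \mathbf{F}(\mathbf{x}, \mathbf{y})) = (x_1, x_2, x_3,\; x_1 y_2^2,\; x_2 x_3 y_1^2 + x_1 y_2). \]
Then we find that
\[
D\mathbf{H}(\mathbf{x}, \mathbf{y}) =
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
y_2^2 & 0 & 0 & 0 & 2x_1 y_2 \\
y_2 & x_3 y_1^2 & x_2 y_1^2 & 2x_2 x_3 y_1 & x_1
\end{bmatrix}.
\]
Notice that
\[
D\mathbf{H}(\mathbf{x}, \mathbf{y}) =
\begin{bmatrix}
I_3 & 0 \\
D_{\mathbf{x}}\mathbf{F}(\mathbf{x}, \mathbf{y}) & D_{\mathbf{y}}\mathbf{F}(\mathbf{x}, \mathbf{y})
\end{bmatrix}.
\]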

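Proof of the Implicit Function Theorem

Let $\mathbf{H} : \mathcal{O} \to \mathbb{R}^{n+m}$ be the mapping defined as
\[ \mathbf{H}(\mathbf{x}, \mathbf{y}) = (\mathbf{x}, \mathbf{F}(\mathbf{x}, \mathbf{y})). \]
Notice that $\mathbf{F}(\mathbf{x}, \mathbf{y}) = \mathbf{0}$ if and only if $\mathbf{H}(\mathbf{x}, \mathbf{y}) = (\mathbf{x}, \mathbf{0})$. Since the first $n$ components of $\mathbf{H}$ are infinitely differentiable functions and $\mathbf{F}$ is continuously differentiable, the mapping $\mathbf{H} : \mathcal{O} \to \mathbb{R}^{n+m}$ is continuously differentiable. Now,
\[
D\mathbf{H}(\mathbf{x}, \mathbf{y}) =
\begin{bmatrix}
I_n & 0 \\
D_{\mathbf{x}}\mathbf{F}(\mathbf{x}, \mathbf{y}) & D_{\mathbf{y}}\mathbf{F}(\mathbf{x}, \mathbf{y})
\end{bmatrix}.
\]
Therefore,
\[ \det D\mathbf{H}(\mathbf{x}_0, \mathbf{y}_0) = \det D_{\mathbf{y}}\mathbf{F}(\mathbf{x}_0, \mathbf{y}_0) \neq 0. \]
By the inverse function theorem, there is a neighbourhood $W$ of $(\mathbf{x}_0, \mathbf{y}_0)$ and a neighbourhood $Z$ of $\mathbf{H}(\mathbf{x}_0, \mathbf{y}_0) = (\mathbf{x}_0, \mathbf{0})$ such that $\mathbf{H} : W \to Z$ is a bijection and $\mathbf{H}^{-1} : Z \to W$ is continuously differentiable. For $\mathbf{u} \in \mathbb{R}^n$, $\mathbf{v} \in \mathbb{R}^m$ so that $(\mathbf{u}, \mathbf{v}) \in Z$, let
\[ \mathbf{H}^{-1}(\mathbf{u}, \mathbf{v}) = (\boldsymbol{\Phi}(\mathbf{u}, \mathbf{v}), \boldsymbol{\Psi}(\mathbf{u}, \mathbf{v})), \]
where $\boldsymbol{\Phi}$ is a map from $Z$ to $\mathbb{R}^n$ and $\boldsymbol{\Psi}$ is a map from $Z$ to $\mathbb{R}^m$. Since $\mathbf{H}^{-1}$ is continuously differentiable, $\boldsymbol{\Phi}$ and $\boldsymbol{\Psi}$ are continuously differentiable.

Given $r > 0$, let $D_r$ be the open cube $D_r = \prod_{i=1}^{m+n} (-r, r)$. Since $W$ and $Z$ are open sets that contain $(\mathbf{x}_0, \mathbf{y}_0)$ and $(\mathbf{x}_0, \mathbf{0})$ respectively, there exists $r > 0$ such that
\[ (\mathbf{x}_0, \mathbf{y}_0) + D_r \subset W, \qquad (\mathbf{x}_0, \mathbf{0}) + D_r \subset Z. \]
If $A_r = \prod_{i=1}^{n} (-r, r)$, $B_r = \prod_{i=1}^{m} (-r, r)$, $U = \mathbf{x}_0 + A_r$, $V = \mathbf{y}_0 + B_r$, then
\[ (\mathbf{x}_0, \mathbf{y}_0) + D_r = U \times V, \qquad (\mathbf{x}_0, \mathbf{0}) + D_r = U \times B_r. \]
Hence, $U \times V \subset W$ and $U \times B_r \subset Z$. Define $\mathbf{G} : U \to \mathbb{R}^m$ by
\[ \mathbf{G}(\mathbf{x}) = \boldsymbol{\Psi}(\mathbf{x}, \mathbf{0}). \]
Since $\boldsymbol{\Psi}$ is continuously differentiable, $\mathbf{G}$ is continuously differentiable. If $\mathbf{x} \in U$, $\mathbf{y} \in V$, then $(\mathbf{x}, \mathbf{y}) \in W$. For such $(\mathbf{x}, \mathbf{y})$, $\mathbf{F}(\mathbf{x}, \mathbf{y}) = \mathbf{0}$ implies $\mathbf{H}(\mathbf{x}, \mathbf{y}) = (\mathbf{x}, \mathbf{0})$. Since $\mathbf{H} : W \to Z$ is a bijection, $(\mathbf{x}, \mathbf{0}) \in Z$ and $\mathbf{H}^{-1}(\mathbf{x}, \mathbf{0}) = (\mathbf{x}, \mathbf{y})$. Comparing the last $m$ components gives
\[ \mathbf{y} = \boldsymbol{\Psi}(\mathbf{x}, \mathbf{0}) = \mathbf{G}(\mathbf{x}). \]
Conversely, since $\mathbf{H}(\mathbf{H}^{-1}(\mathbf{u}, \mathbf{v})) = (\mathbf{u}, \mathbf{v})$ for all $(\mathbf{u}, \mathbf{v}) \in Z$, we find that
\[ \bigl(\boldsymbol{\Phi}(\mathbf{u}, \mathbf{v}),\; \mathbf{F}(\boldsymbol{\Phi}(\mathbf{u}, \mathbf{v}), \boldsymbol{\Psi}(\mathbf{u}, \mathbf{v}))\bigr) = (\mathbf{u}, \mathbf{v}) \]
for all $(\mathbf{u}, \mathbf{v}) \in Z$. For all $\mathbf{u} \in U$, $(\mathbf{u}, \mathbf{0})$ is in $Z$. Therefore,
\[ \boldsymbol{\Phi}(\mathbf{u}, \mathbf{0}) = \mathbf{u}, \qquad \mathbf{F}(\boldsymbol{\Phi}(\mathbf{u}, \mathbf{0}), \boldsymbol{\Psi}(\mathbf{u}, \mathbf{0})) = \mathbf{0}. \]
This implies that if $\mathbf{u} \in U$, then $\mathbf{F}(\mathbf{u}, \mathbf{G}(\mathbf{u})) = \mathbf{0}$. In other words, if $(\mathbf{x}, \mathbf{y})$ is in $U \times V$ and $\mathbf{y} = \mathbf{G}(\mathbf{x})$, we must have $\mathbf{F}(\mathbf{x}, \mathbf{y}) = \mathbf{0}$. Since we have shown that $\mathbf{G} : U \to \mathbb{R}^m$ is continuously differentiable, the formula
\[ D_{\mathbf{x}}\mathbf{F}(\mathbf{x}, \mathbf{G}(\mathbf{x})) + D_{\mathbf{y}}\mathbf{F}(\mathbf{x}, \mathbf{G}(\mathbf{x}))\, D\mathbf{G}(\mathbf{x}) = \mathbf{0} \]
follows from $\mathbf{F}(\mathbf{x}, \mathbf{G}(\mathbf{x})) = \mathbf{0}$ and the chain rule.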


Example 5.11

Consider the system of equations
\[
\begin{aligned}
2x^2 y + 3xy^2 u + xyv + uv &= 7, \\
4xu - 5yv + u^2 y + v^2 x &= 1.
\end{aligned}
\tag{5.7}
\]
Notice that when $(x, y) = (1, 1)$, $(u, v) = (1, 1)$ is a solution of this system. Show that there are neighbourhoods $U$ and $V$ of $(1, 1)$, and a continuously differentiable function $\mathbf{G} : U \to \mathbb{R}^2$ such that if $(x, y, u, v) \in U \times V$, then $(x, y, u, v)$ is a solution of the system of equations above if and only if $u = G_1(x, y)$ and $v = G_2(x, y)$. Also, find the values of $\dfrac{\partial G_1}{\partial x}(1, 1)$, $\dfrac{\partial G_1}{\partial y}(1, 1)$, $\dfrac{\partial G_2}{\partial x}(1, 1)$ and $\dfrac{\partial G_2}{\partial y}(1, 1)$.


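Solution
Define the function $\mathbf{F} : \mathbb{R}^4 \to \mathbb{R}^2$ by
\[ \mathbf{F}(x, y, u, v) = (2x^2 y + 3xy^2 u + xyv + uv - 7,\; 4xu - 5yv + u^2 y + v^2 x - 1). \]
This is a polynomial mapping. Hence, it is continuously differentiable. It is easy to check that $\mathbf{F}(1, 1, 1, 1) = \mathbf{0}$. Now,
\[
D_{(u,v)}\mathbf{F}(x, y, u, v) =
\begin{bmatrix}
3xy^2 + v & xy + u \\
4x + 2uy & -5y + 2vx
\end{bmatrix}.
\]
Thus,
\[ \det D_{(u,v)}\mathbf{F}(1, 1, 1, 1) = \begin{vmatrix} 4 & 2 \\ 6 & -3 \end{vmatrix} = -24 \neq 0. \]
By the implicit function theorem, there are neighbourhoods $U$ and $V$ of $(1, 1)$, and a continuously differentiable function $\mathbf{G} : U \to \mathbb{R}^2$ such that, if $(x, y, u, v) \in U \times V$, then $(x, y, u, v)$ is a solution of the system of equations (5.7) if and only if $u = G_1(x, y)$ and $v = G_2(x, y)$.

Finally,
\[
D_{(x,y)}\mathbf{F}(x, y, u, v) =
\begin{bmatrix}
4xy + 3y^2 u + yv & 2x^2 + 6xyu + xv \\
4u + v^2 & -5v + u^2
\end{bmatrix},
\qquad
D_{(x,y)}\mathbf{F}(1, 1, 1, 1) = \begin{bmatrix} 8 & 9 \\ 5 & -4 \end{bmatrix}.
\]
The chain rule gives
\[
\begin{aligned}
D\mathbf{G}(1, 1) &= -D_{(u,v)}\mathbf{F}(1, 1, 1, 1)^{-1} D_{(x,y)}\mathbf{F}(1, 1, 1, 1) \\
&= \frac{1}{24}\begin{bmatrix} -3 & -2 \\ -6 & 4 \end{bmatrix}\begin{bmatrix} 8 & 9 \\ 5 & -4 \end{bmatrix} \\
&= \frac{1}{24}\begin{bmatrix} -34 & -19 \\ -28 & -70 \end{bmatrix}.
\end{aligned}
\]
Therefore,
\[ \frac{\partial G_1}{\partial x}(1, 1) = -\frac{17}{12}, \quad \frac{\partial G_1}{\partial y}(1, 1) = -\frac{19}{24}, \quad \frac{\partial G_2}{\partial x}(1, 1) = -\frac{7}{6}, \quad \frac{\partial G_2}{\partial y}(1, 1) = -\frac{35}{12}. \]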
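The matrix arithmetic in Example 5.11 is easy to get wrong by hand, so here is a short exact-arithmetic check. It evaluates the two partial Jacobians of $\mathbf{F}$ at $(1, 1, 1, 1)$ and forms $-D_{(u,v)}\mathbf{F}^{-1} D_{(x,y)}\mathbf{F}$. The helper functions are written for this sketch only.

```python
from fractions import Fraction

def D_uv(x, y, u, v):
    # Partial derivatives of F with respect to (u, v).
    return [[3*x*y**2 + v, x*y + u],
            [4*x + 2*u*y, -5*y + 2*v*x]]

def D_xy(x, y, u, v):
    # Partial derivatives of F with respect to (x, y).
    return [[4*x*y + 3*y**2*u + y*v, 2*x**2 + 6*x*y*u + x*v],
            [4*u + v**2, -5*v + u**2]]

def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(M[1][1], det), Fraction(-M[0][1], det)],
            [Fraction(-M[1][0], det), Fraction(M[0][0], det)]]

def mat_mul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

point = (1, 1, 1, 1)
product = mat_mul(inverse_2x2(D_uv(*point)), D_xy(*point))

# DG(1, 1) = -D_(u,v)F^{-1} D_(x,y)F
DG = [[-entry for entry in row] for row in product]
print(DG)
```

The entries of `DG` are the four partial derivatives of $G_1$ and $G_2$ at $(1, 1)$, listed row by row.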




Remark 5.3 The Rank of a Matrix

In the formulation of the implicit function theorem, the assumption that $\det D_{\mathbf{y}}\mathbf{F}(\mathbf{x}_0, \mathbf{y}_0) \neq 0$ can be replaced by the assumption that there are $m$ variables $u_1, \ldots, u_m$ among the $n + m$ variables $x_1, \ldots, x_n, y_1, \ldots, y_m$ such that $\det D_{(u_1, \ldots, u_m)}\mathbf{F}(\mathbf{x}_0, \mathbf{y}_0) \neq 0$.

Recall that the rank $r$ of an $m \times k$ matrix $A$ is the dimension of its row space or the dimension of its column space. Thus, the rank $r$ of an $m \times k$ matrix $A$ is the maximum number of column vectors of $A$ which are linearly independent, or the maximum number of row vectors of $A$ that are linearly independent. Hence, the maximum possible value of $r$ is $\min\{m, k\}$. If $r = \min\{m, k\}$, we say that the matrix $A$ has maximal rank. An $m \times k$ matrix where $m \le k$ has maximal rank if $r = m$. In this case, there is an $m \times m$ submatrix of $A$ whose columns are $m$ linearly independent vectors in $\mathbb{R}^m$. The determinant of this submatrix is nonzero.

Thus, the condition $\det D_{\mathbf{y}}\mathbf{F}(\mathbf{x}_0, \mathbf{y}_0) \neq 0$ in the formulation of the implicit function theorem can be replaced by the condition that the $m \times (m + n)$ matrix $D\mathbf{F}(\mathbf{x}_0, \mathbf{y}_0)$ has maximal rank.


Example 5.12

Consider the system
\[
\begin{aligned}
2x^2 y + 3xy^2 u + xyv + uv &= 7, \\
4xu - 5yv + u^2 y + v^2 x &= 1
\end{aligned}
\tag{5.8}
\]
defined in Example 5.11. Show that there are neighbourhoods $U$ and $V$ of $(1, 1)$, and a continuously differentiable function $\mathbf{H} : V \to \mathbb{R}^2$ such that if $(x, y, u, v) \in U \times V$, then $(x, y, u, v)$ is a solution of the system of equations if and only if $x = H_1(u, v)$ and $y = H_2(u, v)$. Find $D\mathbf{H}(1, 1)$.


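Solution

Define the function $\mathbf{F} : \mathbb{R}^4 \to \mathbb{R}^2$ as in the solution of Example 5.11. Since
\[ \det D_{(x,y)}\mathbf{F}(1, 1, 1, 1) = \begin{vmatrix} 8 & 9 \\ 5 & -4 \end{vmatrix} = -77 \neq 0, \]
the implicit function theorem implies there are neighbourhoods $U$ and $V$ of $(1, 1)$, and a continuously differentiable function $\mathbf{H} : V \to \mathbb{R}^2$ such that if $(x, y, u, v) \in U \times V$, then $(x, y, u, v)$ is a solution of the system of equations (5.8) if and only if $x = H_1(u, v)$ and $y = H_2(u, v)$. Moreover,
\[
\begin{aligned}
D\mathbf{H}(1, 1) &= -D_{(x,y)}\mathbf{F}(1, 1, 1, 1)^{-1} D_{(u,v)}\mathbf{F}(1, 1, 1, 1) \\
&= \frac{1}{77}\begin{bmatrix} -4 & -9 \\ -5 & 8 \end{bmatrix}\begin{bmatrix} 4 & 2 \\ 6 & -3 \end{bmatrix} \\
&= \frac{1}{77}\begin{bmatrix} -70 & 19 \\ 28 & -34 \end{bmatrix}.
\end{aligned}
\]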


Remark 5.4

The function $\mathbf{G} : U \to \mathbb{R}^2$ in Example 5.11 and the function $\mathbf{H} : V \to \mathbb{R}^2$ in Example 5.12 are in fact inverses of each other.

Notice that $D\mathbf{G}(1, 1)$ is invertible. By the inverse function theorem, there is a neighbourhood $U'$ of $(1, 1)$ such that $V' = \mathbf{G}(U')$ is open, and $\mathbf{G} : U' \to V'$ is a bijection with continuously differentiable inverse. By shrinking down the sets $U$ and $V$, we can assume that $U = U'$ and $V = V'$. If $(x, y) \in U$ and $(u, v) \in V$, $\mathbf{F}(x, y, u, v) = \mathbf{0}$ if and only if $(u, v) = \mathbf{G}(x, y)$, if and only if $(x, y) = \mathbf{H}(u, v)$. This implies that $\mathbf{G} : U \to V$ and $\mathbf{H} : V \to U$ are inverses of each other.
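Since $\mathbf{G}$ and $\mathbf{H}$ are inverses of each other, their derivative matrices at $(1, 1)$ should multiply to the identity. The sketch below checks this with exact fractions, starting from the two partial Jacobians of $\mathbf{F}$ at $(1, 1, 1, 1)$ as computed in Examples 5.11 and 5.12. The helper names are illustrative.

```python
from fractions import Fraction

def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(M[1][1], det), Fraction(-M[0][1], det)],
            [Fraction(-M[1][0], det), Fraction(M[0][0], det)]]

def mat_mul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def negate(M):
    return [[-entry for entry in row] for row in M]

# Partial Jacobians of F at (x, y, u, v) = (1, 1, 1, 1).
D_xy = [[8, 9], [5, -4]]
D_uv = [[4, 2], [6, -3]]

DG = negate(mat_mul(inverse_2x2(D_uv), D_xy))  # derivative of G at (1, 1)
DH = negate(mat_mul(inverse_2x2(D_xy), D_uv))  # derivative of H at (1, 1)

product = mat_mul(DG, DH)
print(DH)
print(product)
```

The product being the identity matrix is consistent with the chain rule applied to $\mathbf{G}(\mathbf{H}(u, v)) = (u, v)$.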


At the end of this section, let us consider a geometric application of the implicit function theorem. First let us revisit the example where $f(x, y) = x^2 + y^2 - 1$. At each point $(x_0, y_0)$ such that $f(x_0, y_0) = 0$,
\[ x_0^2 + y_0^2 = 1. \]
Hence, $\nabla f(x_0, y_0) = (2x_0, 2y_0) \neq \mathbf{0}$. Notice that the vector $\nabla f(x_0, y_0) = (2x_0, 2y_0)$ is normal to the circle $x^2 + y^2 = 1$ at the point $(x_0, y_0)$.
Figure 5.6: The tangent vector and normal vector at a point on the circle $x^2 + y^2 - 1 = 0$.


If $y_0 > 0$, let $U = (-1, 1) \times (0, \infty)$. Restricted to $U$, the set of points where $f(x, y) = 0$ is the graph of the function $g : (-1, 1) \to \mathbb{R}$, $g(x) = \sqrt{1 - x^2}$.

If $y_0 < 0$, let $U = (-1, 1) \times (-\infty, 0)$. Restricted to $U$, the set of points where $f(x, y) = 0$ is the graph of the function $g : (-1, 1) \to \mathbb{R}$, $g(x) = -\sqrt{1 - x^2}$.

If $y_0 = 0$, then $x_0 = 1$ or $-1$. In fact, we can consider more generally the cases where $x_0 > 0$ and $x_0 < 0$.

If $x_0 > 0$, let $U = (0, \infty) \times (-1, 1)$. Restricted to $U$, the set of points where $f(x, y) = 0$ is the graph of the function $g : (-1, 1) \to \mathbb{R}$, $g(y) = \sqrt{1 - y^2}$.

If $x_0 < 0$, let $U = (-\infty, 0) \times (-1, 1)$. Restricted to $U$, the set of points where $f(x, y) = 0$ is the graph of the function $g : (-1, 1) \to \mathbb{R}$, $g(y) = -\sqrt{1 - y^2}$.

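Definition 5.4 Surfaces

Let $S$ be a subset of $\mathbb{R}^k$ for some positive integer $k$. We say that $S$ is an $n$-dimensional surface if for each $\mathbf{x}_0$ on $S$, there is an open subset $D$ of $\mathbb{R}^n$, an open neighbourhood $U$ of $\mathbf{x}_0$ in $\mathbb{R}^k$, and a one-to-one differentiable mapping $\mathbf{G} : D \to \mathbb{R}^k$ such that $\mathbf{G}(D) \subset S$, $\mathbf{G}(D) \cap U = S \cap U$, and $D\mathbf{G}(\mathbf{u})$ has rank $n$ at each $\mathbf{u} \in D$.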




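Example 5.13
We claim that the $n$-sphere
\[ S^n = \{(x_1, \ldots, x_n, x_{n+1}) \mid x_1^2 + \cdots + x_n^2 + x_{n+1}^2 = 1\} \]
is an $n$-dimensional surface. Let $(a_1, \ldots, a_n, a_{n+1})$ be a point on $S^n$. Then at least one of the components $a_1, \ldots, a_n, a_{n+1}$ is nonzero. Without loss of generality, assume that $a_{n+1} > 0$. Let
\[ D = \{(x_1, \ldots, x_n) \mid x_1^2 + \cdots + x_n^2 < 1\}, \]
\[ U = \{(x_1, \ldots, x_n, x_{n+1}) \mid x_{n+1} > 0\}, \]
and define the mapping $\mathbf{G} : D \to U$ by
\[ \mathbf{G}(x_1, \ldots, x_n) = \Bigl(x_1, \ldots, x_n, \sqrt{1 - x_1^2 - \cdots - x_n^2}\Bigr). \]
Then $\mathbf{G}$ is a one-to-one differentiable mapping, $\mathbf{G}(D) \subset S^n$ and $\mathbf{G}(D) \cap U = S^n \cap U$. Now,
\[ D\mathbf{G}(x_1, \ldots, x_n) = \begin{bmatrix} I_n \\ \mathbf{v} \end{bmatrix}, \]
where $\mathbf{v} = \nabla G_{n+1}(x_1, \ldots, x_n)$. Since the first $n$ rows of $D\mathbf{G}(x_1, \ldots, x_n)$ form the $n \times n$ identity matrix, it has rank $n$. Thus, $S^n$ is an $n$-dimensional surface.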


Generalizing Example 5.13, we find that a large class of surfaces is provided 
by graphs of differentiable functions. 


Theorem 5.12

Let $D$ be an open subset of $\mathbb{R}^n$, and let $g : D \to \mathbb{R}$ be a differentiable mapping. Then the graph of $g$ given by
\[ G_g = \{(x_1, \ldots, x_n, x_{n+1}) \mid (x_1, \ldots, x_n) \in D,\; x_{n+1} = g(x_1, \ldots, x_n)\} \]
is an $n$-dimensional surface.


A hyperplane in $\mathbb{R}^{n+1}$ is the set of points in $\mathbb{R}^{n+1}$ which satisfies an equation of the form
\[ a_1 x_1 + \cdots + a_n x_n + a_{n+1} x_{n+1} = b, \]
where $\mathbf{a} = (a_1, \ldots, a_n, a_{n+1})$ is a nonzero vector in $\mathbb{R}^{n+1}$. By definition, if $\mathbf{u}$ and $\mathbf{v}$ are two points on the plane, then
\[ \langle \mathbf{a}, \mathbf{u} - \mathbf{v} \rangle = 0. \]
This shows that $\mathbf{a}$ is a vector normal to the plane.

When $D$ is an open subset of $\mathbb{R}^n$, and $g : D \to \mathbb{R}$ is a differentiable mapping, the graph $G_g$ of $g$ is an $n$-dimensional surface. If $\mathbf{u} = (u_1, \ldots, u_n)$ is a point on $D$, so that $(\mathbf{u}, g(\mathbf{u}))$ is a point on $G_g$, we have seen that the equation of the tangent plane at the point $(\mathbf{u}, g(\mathbf{u}))$ is given by
\[ x_{n+1} = g(\mathbf{u}) + \sum_{i=1}^{n} \frac{\partial g}{\partial x_i}(\mathbf{u})\,(x_i - u_i). \]
The implicit function theorem gives the following.


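Theorem 5.13

Let $\mathcal{O}$ be an open subset of $\mathbb{R}^{n+1}$, and let $f : \mathcal{O} \to \mathbb{R}$ be a continuously differentiable function. If $\mathbf{x}_0$ is a point in $\mathcal{O}$ such that $f(\mathbf{x}_0) = 0$ and $\nabla f(\mathbf{x}_0) \neq \mathbf{0}$, then there is a neighbourhood $U$ of $\mathbf{x}_0$ contained in $\mathcal{O}$ such that, restricted to $U$, the set $f(\mathbf{x}) = 0$ is the graph of a continuously differentiable function $g : D \to \mathbb{R}$, and $\nabla f(\mathbf{x})$ is a vector normal to the tangent plane of the graph at the point $\mathbf{x}$.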


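Proof
Assume that $\mathbf{x}_0 = (a_1, \ldots, a_n, a_{n+1})$. Since $\nabla f(\mathbf{x}_0) \neq \mathbf{0}$, there is a $1 \le k \le n + 1$ such that $\dfrac{\partial f}{\partial x_k}(\mathbf{x}_0) \neq 0$. Without loss of generality, assume that $k = n + 1$.

Given a point $\mathbf{x} = (x_1, \ldots, x_n, x_{n+1})$ in $\mathbb{R}^{n+1}$, let $\mathbf{u} = (x_1, \ldots, x_n)$ so that $\mathbf{x} = (\mathbf{u}, x_{n+1})$. By the implicit function theorem, there is a neighbourhood $D$ of $\mathbf{u}_0 = (a_1, \ldots, a_n)$, an $r > 0$, and a continuously differentiable function $g : D \to \mathbb{R}$ such that if $U = D \times (a_{n+1} - r, a_{n+1} + r)$ and $(\mathbf{u}, u_{n+1}) \in U$, then $f(\mathbf{u}, u_{n+1}) = 0$ if and only if $u_{n+1} = g(\mathbf{u})$. In other words, in the neighbourhood $U$ of $\mathbf{x}_0 = (\mathbf{u}_0, a_{n+1})$, $f(\mathbf{u}, u_{n+1}) = 0$ if and only if $(\mathbf{u}, u_{n+1})$ is a point on the graph of the function $g$. The equation of the tangent plane at the point $(\mathbf{u}, u_{n+1})$ is
\[ x_{n+1} - u_{n+1} = \sum_{i=1}^{n} \frac{\partial g}{\partial x_i}(\mathbf{u})\,(x_i - u_i). \]
By the chain rule,
\[ \frac{\partial g}{\partial x_i}(\mathbf{u}) = -\,\frac{\dfrac{\partial f}{\partial x_i}(\mathbf{u}, u_{n+1})}{\dfrac{\partial f}{\partial x_{n+1}}(\mathbf{u}, u_{n+1})}. \]
Hence, the equation of the tangent plane can be rewritten as
\[ \sum_{i=1}^{n+1} (x_i - u_i)\,\frac{\partial f}{\partial x_i}(\mathbf{u}, u_{n+1}) = 0. \]
This shows that $\nabla f(\mathbf{u}, u_{n+1})$ is a vector normal to the tangent plane.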


Example 5.14

Find the equation of the tangent plane to the surface $x^2 + 4y^2 + 9z^2 = 49$ at the point $(6, 1, -1)$.

Solution
Let $f(x, y, z) = x^2 + 4y^2 + 9z^2$. Then $\nabla f(x, y, z) = (2x, 8y, 18z)$. It follows that $\nabla f(6, 1, -1) = 2(6, 4, -9)$. Hence, the equation of the tangent plane to the surface at $(6, 1, -1)$ is
\[ 6x + 4y - 9z = 36 + 4 + 9 = 49. \]
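The recipe used in Example 5.14 (take the gradient as the normal vector, then fix the constant by substituting the point) can be sketched in a few lines. The function `tangent_plane` is a hypothetical helper written for this illustration; it returns the normal vector and the right-hand side of the plane equation $\langle \mathbf{n}, \mathbf{x} \rangle = b$.

```python
def f(x, y, z):
    return x**2 + 4 * y**2 + 9 * z**2

def grad_f(x, y, z):
    # Gradient of f, computed by hand.
    return (2 * x, 8 * y, 18 * z)

def tangent_plane(p):
    # Tangent plane to the level set of f through p: <normal, x> = offset.
    normal = grad_f(*p)
    offset = sum(n * c for n, c in zip(normal, p))
    return normal, offset

p = (6, 1, -1)
normal, offset = tangent_plane(p)
print(f(*p))           # value of f at the point, i.e. the level of the surface
print(normal, offset)  # plane: 12x + 8y - 18z = 98, i.e. 6x + 4y - 9z = 49
```

Dividing the normal vector and the offset by the common factor 2 gives the equation stated in the solution.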




Exercises 5.3 
Question 1 
Consider the equation 
eee ee 


Show that in a neighbourhood of (—1,1,2), this equation defines z as 
a function of (z,y). If this function is denoted as z = g(x,y), find 
Vg(-1, 1). 


Question 2
Consider the system of equations
\[
\begin{aligned}
2xu^2 + vyz + 3uv &= 2, \\
5x + 7yzu - v^2 &= 1.
\end{aligned}
\]

(a) Show that when $(x, y, z) = (-1, 1, 1)$, $(u, v) = (1, 1)$ is a solution of this system.

(b) Show that there are neighbourhoods $U$ and $V$ of $(-1, 1, 1)$ and $(1, 1)$, and a continuously differentiable function $\mathbf{G} : U \to \mathbb{R}^2$ such that, if $(x, y, z, u, v) \in U \times V$, then $(x, y, z, u, v)$ is a solution of the system of equations above if and only if $u = G_1(x, y, z)$ and $v = G_2(x, y, z)$.


OG OG» 
5 3, (oh iL. i) and a, (oh 1, i 


OG 
(c) Find the values of aa lees) 
a 


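Question 3

Let $\mathcal{O}$ be an open subset of $\mathbb{R}^{2n}$, and let $\mathbf{F} : \mathcal{O} \to \mathbb{R}^n$ be a continuously differentiable function. Assume that $\mathbf{x}_0$ and $\mathbf{y}_0$ are points in $\mathbb{R}^n$ such that $(\mathbf{x}_0, \mathbf{y}_0)$ is a point in $\mathcal{O}$, $\mathbf{F}(\mathbf{x}_0, \mathbf{y}_0) = \mathbf{0}$, and $D_{\mathbf{x}}\mathbf{F}(\mathbf{x}_0, \mathbf{y}_0)$ and $D_{\mathbf{y}}\mathbf{F}(\mathbf{x}_0, \mathbf{y}_0)$ are invertible. Show that there exist neighbourhoods $U$ and $V$ of $\mathbf{x}_0$ and $\mathbf{y}_0$, and a continuously differentiable bijective function $\mathbf{G} : U \to V$ such that, if $(\mathbf{x}, \mathbf{y})$ is in $U \times V$, $\mathbf{F}(\mathbf{x}, \mathbf{y}) = \mathbf{0}$ if and only if $\mathbf{y} = \mathbf{G}(\mathbf{x})$.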




5.4 Extrema Problems and the Method of Lagrange Multipliers 


Optimization problems are very important in our daily life and in mathematical sciences. Given a function $f : \mathfrak{D} \to \mathbb{R}$, we would like to know whether it has a maximum value or a minimum value. In Chapter 3, we have discussed the extreme value theorem, which asserts that a continuous function that is defined on a compact set must have maximum and minimum values. In Chapter 4, we showed that if a function $f : \mathfrak{D} \to \mathbb{R}$ has a (local) extremum at an interior point $\mathbf{x}_0$ of its domain $\mathfrak{D}$ and it is differentiable at $\mathbf{x}_0$, then $\mathbf{x}_0$ must be a stationary point. Namely, $\nabla f(\mathbf{x}_0) = \mathbf{0}$.

Combining these various results, we can formulate a strategy for solving a 
special type of optimization problems. Let us first consider the following example. 


Example 5.15
Let
\[ K = \{(x, y) \mid x^2 + 4y^2 \le 100\}, \]
and let $f : K \to \mathbb{R}$ be the function defined as
\[ f(x, y) = x^2 + y^2. \]
Find the maximum and minimum values of $f : K \to \mathbb{R}$, and the points where these values appear.


Solution
Let $g : \mathbb{R}^2 \to \mathbb{R}$ be the function defined as $g(x, y) = x^2 + 4y^2 - 100$. It is a polynomial function. Hence, it is continuous. Since $K = g^{-1}((-\infty, 0])$ and $(-\infty, 0]$ is closed in $\mathbb{R}$, $K$ is a closed set. By a previous exercise,
\[ \mathcal{O} = \text{int}\, K = \{(x, y) \mid x^2 + 4y^2 < 100\}, \]
\[ C = \text{bd}\, K = \{(x, y) \mid x^2 + 4y^2 = 100\}. \]
For any $(x, y) \in K$, $\|(x, y)\|^2 = x^2 + y^2 \le x^2 + 4y^2 \le 100$. Therefore, $K$ is bounded. Since $K$ is closed and bounded, and the function $f : K \to \mathbb{R}$, $f(x, y) = x^2 + y^2$ is continuous, the extreme value theorem says that $f$ has maximum and minimum values. These values appear either in $\mathcal{O}$ or on $C$.

Since $f : \mathcal{O} \to \mathbb{R}$ is differentiable, if $(x_0, y_0)$ is an extremizer of $f : \mathcal{O} \to \mathbb{R}$, we must have $\nabla f(x_0, y_0) = (0, 0)$, which gives $(x_0, y_0) = (0, 0)$.

The other candidates of extremizers are on $C$. Therefore, we need to find the maximum and minimum values of $f(x, y) = x^2 + y^2$ subject to the constraint $x^2 + 4y^2 = 100$. From $x^2 + 4y^2 = 100$, we find that $x^2 = 100 - 4y^2$, and $y$ can only take values in the interval $[-5, 5]$. Hence, we want to find the maximum and minimum values of $h : [-5, 5] \to \mathbb{R}$,
\[ h(y) = 100 - 4y^2 + y^2 = 100 - 3y^2. \]
When $y = 0$, $h$ has maximum value 100, and when $y = \pm 5$, it has minimum value $100 - 3 \times 25 = 25$. Notice that when $y = 0$, $x = \pm 10$; while when $y = \pm 5$, $x = 0$.

Hence, we have five candidates for the extremizers of $f$. Namely, $\mathbf{u}_1 = (0, 0)$, $\mathbf{u}_2 = (10, 0)$, $\mathbf{u}_3 = (-10, 0)$, $\mathbf{u}_4 = (0, 5)$ and $\mathbf{u}_5 = (0, -5)$. The function values at these 5 points are
\[ f(\mathbf{u}_1) = 0, \qquad f(\mathbf{u}_2) = f(\mathbf{u}_3) = 100, \qquad f(\mathbf{u}_4) = f(\mathbf{u}_5) = 25. \]
Therefore, the minimum value of $f : K \to \mathbb{R}$ is 0, and the maximum value is 100. The minimum value appears at the point $(0, 0) \in \text{int}\, K$, while the maximum value appears at $(\pm 10, 0) \in \text{bd}\, K$.


Example 5.15 gives a typical scenario of the optimization problems that we 
want to study in this section. 


Figure 5.7: The extreme values of $f(x, y) = x^2 + y^2$ on the sets $K = \{(x, y) \mid x^2 + 4y^2 \le 100\}$ and $C = \{(x, y) \mid x^2 + 4y^2 = 100\}$.


Optimization Problem

Let $K$ be a compact subset of $\mathbb{R}^n$ with interior $\mathcal{O}$, and let $f : K \to \mathbb{R}$ be a function continuous on $K$, differentiable on $\mathcal{O}$. We want to find the maximum and minimum values of $f : K \to \mathbb{R}$.

(i) By the extreme value theorem, $f : K \to \mathbb{R}$ has maximum and minimum values.

(ii) Since $K$ is closed, $K$ is a disjoint union of its interior $\mathcal{O}$ and its boundary $C$. Since $C$ is a subset of $K$, it is bounded. On the other hand, being the boundary of a set, $C$ is closed. Therefore, $C$ is compact.

(iii) The extreme values of $f$ can appear in $\mathcal{O}$ or on $C$.

(iv) If $\mathbf{x}_0$ is an extremizer of $f : K \to \mathbb{R}$ and it is in $\mathcal{O}$, we must have $\nabla f(\mathbf{x}_0) = \mathbf{0}$. Namely, $\mathbf{x}_0$ is a stationary point of $f : \mathcal{O} \to \mathbb{R}$.

(v) If $\mathbf{x}_0$ is an extremizer of $f : K \to \mathbb{R}$ and it is not in $\mathcal{O}$, it is an extremizer of $f : C \to \mathbb{R}$.

(vi) Since $C$ is compact, $f : C \to \mathbb{R}$ has maximum and minimum values.

Therefore, the steps to find the maximum and minimum values of $f : K \to \mathbb{R}$ are as follows.

Step 1 Find the stationary points of $f : \mathcal{O} \to \mathbb{R}$.

Step 2 Find the extremizers of $f : C \to \mathbb{R}$.

Step 3 Compare the values of $f$ at the stationary points of $f : \mathcal{O} \to \mathbb{R}$ and the extremizers of $f : C \to \mathbb{R}$ to determine the extreme values of $f : K \to \mathbb{R}$.
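The three steps above can be sketched numerically for the data of Example 5.15, where $f(x, y) = x^2 + y^2$ and $K$ is the region $x^2 + 4y^2 \le 100$. The boundary is handled here by sampling the parametrization $(10\cos t, 5\sin t)$, which is an approximation introduced for this illustration and not a substitute for the exact argument. The sample count `N` is an arbitrary choice.

```python
import math

def f(x, y):
    return x**2 + y**2

# Step 1: stationary points in the interior.
# grad f = (2x, 2y) vanishes only at the origin, which lies inside K.
interior_candidates = [(0.0, 0.0)]

# Step 2: approximate extremizers on the boundary x^2 + 4y^2 = 100
# by sampling the parametrization (10 cos t, 5 sin t).
N = 3600
boundary_points = [(10 * math.cos(2 * math.pi * k / N),
                    5 * math.sin(2 * math.pi * k / N)) for k in range(N)]
boundary_max = max(boundary_points, key=lambda p: f(*p))
boundary_min = min(boundary_points, key=lambda p: f(*p))

# Step 3: compare the values of f at all candidates.
candidates = interior_candidates + [boundary_max, boundary_min]
max_value = max(f(*p) for p in candidates)
min_value = min(f(*p) for p in candidates)

print(max_value, min_value)
print(boundary_max, boundary_min)
```

With this sampling the boundary maximizer is found at $(10, 0)$ and the boundary minimizer near $(0, \pm 5)$, in line with the values obtained analytically in Example 5.15.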


Of particular interest is when the boundary of $K$ can be expressed as $g(\mathbf{x}) = 0$, where $g : D \to \mathbb{R}$ is a continuously differentiable function defined on an open subset $D$ of $\mathbb{R}^n$. If $f$ is also defined and differentiable on $D$, the problem of finding the extreme values of $f : C \to \mathbb{R}$ becomes finding the extreme values of $f : D \to \mathbb{R}$ subject to the constraint $g(\mathbf{x}) = 0$. In Example 5.15, we have used $g(\mathbf{x}) = 0$ to solve one of the variables in terms of the others and substituted into $f$ to transform the optimization problem to a problem with fewer variables. However, this strategy can be quite complicated because it is often not possible to solve one variable in terms of the others explicitly from the constraint $g(\mathbf{x}) = 0$. The method of Lagrange multipliers provides a way to solve constraint optimization problems without having to explicitly solve some variables in terms of the others. The validity of this method is justified by the implicit function theorem.


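Theorem 5.14 The Method of Lagrange Multiplier (One Constraint)

Let $\mathcal{O}$ be an open subset of $\mathbb{R}^{n+1}$ and let $f : \mathcal{O} \to \mathbb{R}$ and $g : \mathcal{O} \to \mathbb{R}$ be continuously differentiable functions defined on $\mathcal{O}$. Consider the subset of $\mathcal{O}$ defined as
\[ C = \{\mathbf{x} \in \mathcal{O} \mid g(\mathbf{x}) = 0\}. \]
If $\mathbf{x}_0$ is an extremizer of the function $f : C \to \mathbb{R}$ and $\nabla g(\mathbf{x}_0) \neq \mathbf{0}$, then there is a constant $\lambda$, known as the Lagrange multiplier, such that
\[ \nabla f(\mathbf{x}_0) = \lambda \nabla g(\mathbf{x}_0). \]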


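Proof
Without loss of generality, assume that $\mathbf{x}_0$ is a maximizer of $f : C \to \mathbb{R}$. Namely,
\[ f(\mathbf{x}_0) \ge f(\mathbf{x}) \quad \text{for all } \mathbf{x} \in C. \tag{5.9} \]
Given that $\nabla g(\mathbf{x}_0) \neq \mathbf{0}$, there exists a $1 \le k \le n + 1$ such that $\dfrac{\partial g}{\partial x_k}(\mathbf{x}_0) \neq 0$. Without loss of generality, assume that $k = n + 1$.

Let $\mathbf{x}_0 = (a_1, \ldots, a_n, a_{n+1})$. Given a point $\mathbf{x} = (x_1, \ldots, x_n, x_{n+1})$ in $\mathbb{R}^{n+1}$, let $\mathbf{u} = (x_1, \ldots, x_n)$ so that $\mathbf{x} = (\mathbf{u}, x_{n+1})$. By the implicit function theorem, there is a neighbourhood $D$ of $\mathbf{u}_0 = (a_1, \ldots, a_n)$, an $r > 0$, and a continuously differentiable function $h : D \to \mathbb{R}$ such that for $(\mathbf{u}, x_{n+1}) \in D \times (a_{n+1} - r, a_{n+1} + r)$, $g(\mathbf{u}, x_{n+1}) = 0$ if and only if $x_{n+1} = h(\mathbf{u})$. Consider the function $F : D \to \mathbb{R}$ defined as
\[ F(\mathbf{u}) = f(\mathbf{u}, h(\mathbf{u})). \]
By (5.9), we find that
\[ F(\mathbf{u}_0) \ge F(\mathbf{u}) \quad \text{for all } \mathbf{u} \in D. \]
In other words, $\mathbf{u}_0$ is a maximizer of the function $F : D \to \mathbb{R}$. Since $\mathbf{u}_0$ is an interior point of $D$ and $F : D \to \mathbb{R}$ is continuously differentiable, $\nabla F(\mathbf{u}_0) = \mathbf{0}$. Since $F(\mathbf{u}) = f(\mathbf{u}, h(\mathbf{u}))$, we find that for $1 \le i \le n$,
\[ \frac{\partial F}{\partial x_i}(\mathbf{u}_0) = \frac{\partial f}{\partial x_i}(\mathbf{u}_0, a_{n+1}) + \frac{\partial f}{\partial x_{n+1}}(\mathbf{u}_0, a_{n+1})\,\frac{\partial h}{\partial x_i}(\mathbf{u}_0) = 0. \tag{5.10} \]
On the other hand, applying the chain rule to $g(\mathbf{u}, h(\mathbf{u})) = 0$ and setting $\mathbf{u} = \mathbf{u}_0$, we find that
\[ \frac{\partial g}{\partial x_i}(\mathbf{u}_0, a_{n+1}) + \frac{\partial g}{\partial x_{n+1}}(\mathbf{u}_0, a_{n+1})\,\frac{\partial h}{\partial x_i}(\mathbf{u}_0) = 0 \quad \text{for } 1 \le i \le n. \tag{5.11} \]
By assumption, $\dfrac{\partial g}{\partial x_{n+1}}(\mathbf{x}_0) \neq 0$. Let
\[ \lambda = \frac{\dfrac{\partial f}{\partial x_{n+1}}(\mathbf{x}_0)}{\dfrac{\partial g}{\partial x_{n+1}}(\mathbf{x}_0)}. \]
Then
\[ \frac{\partial f}{\partial x_{n+1}}(\mathbf{x}_0) = \lambda\,\frac{\partial g}{\partial x_{n+1}}(\mathbf{x}_0). \tag{5.12} \]
Eqs. (5.10) and (5.11) show that for $1 \le i \le n$,
\[ \frac{\partial f}{\partial x_i}(\mathbf{x}_0) = -\frac{\partial f}{\partial x_{n+1}}(\mathbf{x}_0)\,\frac{\partial h}{\partial x_i}(\mathbf{u}_0) = -\lambda\,\frac{\partial g}{\partial x_{n+1}}(\mathbf{x}_0)\,\frac{\partial h}{\partial x_i}(\mathbf{u}_0) = \lambda\,\frac{\partial g}{\partial x_i}(\mathbf{x}_0). \tag{5.13} \]
Eqs. (5.12) and (5.13) together imply that
\[ \nabla f(\mathbf{x}_0) = \lambda \nabla g(\mathbf{x}_0). \]
This completes the proof of the theorem.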


Remark 5.5

Theorem 5.14 says that if $\mathbf{x}_0$ is an extremizer of the constraint optimization problem max / min $f(\mathbf{x})$ subject to $g(\mathbf{x}) = 0$, then the gradient of $f$ at $\mathbf{x}_0$ should be parallel to the gradient of $g$ at $\mathbf{x}_0$ if the latter is nonzero. One can refer to Figure 5.7 for an illustration. Recall that the gradient of $f$ gives the direction where $f$ changes most rapidly, while the gradient of $g$ here represents the normal vector to the curve $g(\mathbf{x}) = 0$.

Using the method of Lagrange multiplier, there are $n + 2$ variables $x_1, \ldots, x_{n+1}$ and $\lambda$ to be solved. The equation $\nabla f(\mathbf{x}) = \lambda \nabla g(\mathbf{x})$ gives $n + 1$ equations, while the equation $g(\mathbf{x}) = 0$ gives one. Therefore, we need to solve $n + 2$ variables from $n + 2$ equations.


Example 5.16

Let us solve the constraint optimization problem that appears in Example 5.15 using the Lagrange multiplier method. Let $f : \mathbb{R}^2 \to \mathbb{R}$ and $g : \mathbb{R}^2 \to \mathbb{R}$ be respectively the functions $f(x, y) = x^2 + y^2$ and $g(x, y) = x^2 + 4y^2 - 100$. They are both continuously differentiable. We want to find the maximum and minimum values of the function $f(x, y)$ subject to the constraint $g(x, y) = 0$. Notice that $\nabla g(x, y) = (2x, 8y)$ is the zero vector if and only if $(x, y) = (0, 0)$, but $(0, 0)$ is not on the curve $g(x, y) = 0$. Hence, for any $(x, y)$ satisfying $g(x, y) = 0$, $\nabla g(x, y) \neq \mathbf{0}$.

By the method of Lagrange multiplier, we need to find $(x, y)$ satisfying
\[ \nabla f(x, y) = \lambda \nabla g(x, y) \quad \text{and} \quad g(x, y) = 0. \]
Therefore,
\[ 2x = 2\lambda x, \qquad 2y = 8\lambda y. \]
This gives
\[ x(1 - \lambda) = 0, \qquad y(1 - 4\lambda) = 0. \]
The first equation says that either $x = 0$ or $\lambda = 1$.

If $x = 0$, from $x^2 + 4y^2 = 100$, we must have $y = \pm 5$.

If $\lambda = 1$, then $y(1 - 4\lambda) = 0$ implies that $y = 0$. From $x^2 + 4y^2 = 100$, we then obtain $x = \pm 10$.

Hence, we find that the candidates for the extremizers are $(\pm 10, 0)$ and $(0, \pm 5)$. Since $f(\pm 10, 0) = 100$ and $f(0, \pm 5) = 25$, we conclude that subject to $x^2 + 4y^2 = 100$, the maximum value of $f(x, y) = x^2 + y^2$ is 100, and the minimum value of $f(x, y) = x^2 + y^2$ is 25.
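The candidates found in Example 5.16 can be verified directly against the Lagrange conditions. In the sketch below, the multiplier paired with each point is $\lambda = 1$ for $(\pm 10, 0)$ and $\lambda = 1/4$ for $(0, \pm 5)$; the latter value follows from $2y = 8\lambda y$ with $y \neq 0$. The function `satisfies_lagrange` is a hypothetical helper for this illustration.

```python
def f(x, y):
    return x**2 + y**2

def g(x, y):
    return x**2 + 4 * y**2 - 100

def grad_f(x, y):
    return (2 * x, 2 * y)

def grad_g(x, y):
    return (2 * x, 8 * y)

def satisfies_lagrange(point, lam):
    # Checks grad f = lam * grad g together with the constraint g = 0.
    gf, gg = grad_f(*point), grad_g(*point)
    return g(*point) == 0 and all(a == lam * b for a, b in zip(gf, gg))

# (point, multiplier) pairs taken from the case analysis in the example.
candidates = [((10, 0), 1), ((-10, 0), 1), ((0, 5), 0.25), ((0, -5), 0.25)]

checks = [satisfies_lagrange(p, lam) for p, lam in candidates]
values = [f(*p) for p, _ in candidates]
print(checks)
print(max(values), min(values))
```

All four checks should come out true, and the largest and smallest values of $f$ among the candidates are the constrained maximum and minimum.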


Example 5.17 


Use the Lagrange multiplier method to find the maximum and minimum
values of the function $f(x, y, z) = 8x + 24y + 27z$ on the set

$$S = \left\{ (x,y,z) \mid x^2 + 4y^2 + 9z^2 = 289 \right\},$$

and the points where each of them appears.


Solution
Let $g : \mathbb{R}^3 \to \mathbb{R}$ be the function

$$g(x,y,z) = x^2 + 4y^2 + 9z^2 - 289.$$

The functions $f : \mathbb{R}^3 \to \mathbb{R}$, $f(x,y,z) = 8x + 24y + 27z$ and $g : \mathbb{R}^3 \to \mathbb{R}$
are both continuously differentiable.


Notice that $\nabla g(x,y,z) = (2x, 8y, 18z) = \mathbf{0}$ if and only if $(x,y,z) = \mathbf{0}$, and
$\mathbf{0}$ does not lie on $S$. By the Lagrange multiplier method, to find the maximum
and minimum values of $f : S \to \mathbb{R}$, we need to solve the equations

$$\nabla f(x,y,z) = \lambda \nabla g(x,y,z) \quad \text{and} \quad g(x,y,z) = 0.$$

These give

$$8 = 2\lambda x, \quad 24 = 8\lambda y, \quad 27 = 18\lambda z, \quad x^2 + 4y^2 + 9z^2 = 289.$$

To satisfy the first three equations, none of $\lambda$, $x$, $y$ and $z$ can be zero.
We find that

$$x = \frac{4}{\lambda}, \quad y = \frac{3}{\lambda}, \quad z = \frac{3}{2\lambda}.$$

Substituting into the last equation, we have

$$\frac{64 + 144 + 81}{4\lambda^2} = 289.$$

This gives $4\lambda^2 = 1$. Hence, $\lambda = \pm\frac{1}{2}$. When $\lambda = \frac{1}{2}$, $(x,y,z) = (8, 6, 3)$.
When $\lambda = -\frac{1}{2}$, $(x,y,z) = (-8, -6, -3)$. These are the two candidates for
the extremizers of $f : S \to \mathbb{R}$.
Since $f(8,6,3) = 289$ and $f(-8,-6,-3) = -289$, we find that the
maximum and minimum values of $f : S \to \mathbb{R}$ are 289 and $-289$
respectively. The maximum value appears at $(8, 6, 3)$, and the minimum value
appears at $(-8, -6, -3)$.
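As a sanity check of Example 5.17, the sketch below evaluates the two candidates obtained from $x = 4/\lambda$, $y = 3/\lambda$, $z = 3/(2\lambda)$ and compares them against random points of $S$. The sampling scheme (scaling a random unit vector so that it lands on the ellipsoid) and all helper names are assumptions made for this illustration only.

```python
import math
import random

def f(x, y, z):
    return 8 * x + 24 * y + 27 * z

def g(x, y, z):
    return x**2 + 4 * y**2 + 9 * z**2 - 289

# Candidates from the Lagrange equations, one for each multiplier.
candidates = {}
for lam in (0.5, -0.5):
    point = (4 / lam, 3 / lam, 3 / (2 * lam))
    candidates[lam] = (point, f(*point), g(*point))

def random_point_on_S(rng):
    # If (u1, u2, u3) is a unit vector, then (17 u1, 17 u2 / 2, 17 u3 / 3)
    # satisfies x^2 + 4y^2 + 9z^2 = 289.
    u = [rng.gauss(0, 1) for _ in range(3)]
    r = math.sqrt(sum(c * c for c in u))
    return 17 * u[0] / r, 17 * u[1] / (2 * r), 17 * u[2] / (3 * r)

rng = random.Random(0)
samples = [f(*random_point_on_S(rng)) for _ in range(10000)]
print(candidates, max(samples), min(samples))
```

Every sampled value lies between the two candidate values, which is consistent with 289 and $-289$ being the maximum and the minimum.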


Now we consider more general constraint optimization problems which can
have more than one constraint.




Theorem 5.15 The Method of Lagrange Multiplier (General) 


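Let $\mathcal{O}$ be an open subset of $\mathbb{R}^{m+n}$ and let $f : \mathcal{O} \to \mathbb{R}$ and $\mathbf{G} : \mathcal{O} \to \mathbb{R}^m$ be
continuously differentiable functions defined on $\mathcal{O}$. Consider the subset of
$\mathcal{O}$ defined as

$$C = \left\{ \mathbf{x} \in \mathcal{O} \mid \mathbf{G}(\mathbf{x}) = \mathbf{0} \right\}.$$

If $\mathbf{x}_0$ is an extremizer of the function $f : C \to \mathbb{R}$ and the matrix $\mathbf{DG}(\mathbf{x}_0)$
has (maximal) rank $m$, then there are constants $\lambda_1, \ldots, \lambda_m$, known as the
Lagrange multipliers, such that

$$\nabla f(\mathbf{x}_0) = \sum_{i=1}^{m} \lambda_i \nabla G_i(\mathbf{x}_0).$$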


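Without loss of generality, assume that $\mathbf{x}_0$ is a maximizer of $f : C \to \mathbb{R}$.
Namely,
$$f(\mathbf{x}_0) \geq f(\mathbf{x}) \quad \text{for all } \mathbf{x} \in C. \tag{5.14}$$

Given that the matrix $\mathbf{DG}(\mathbf{x}_0)$ has rank $m$, $m$ of the column vectors are
linearly independent. Without loss of generality, assume that the column
vectors in the last $m$ columns are linearly independent. Write a point $\mathbf{x}$ in
$\mathbb{R}^{m+n}$ as $\mathbf{x} = (\mathbf{u}, \mathbf{v})$, where $\mathbf{u} = (u_1, \ldots, u_n)$ is in $\mathbb{R}^n$ and $\mathbf{v} = (v_1, \ldots, v_m)$
is in $\mathbb{R}^m$. By our assumption, $\mathbf{D}_{\mathbf{v}}\mathbf{G}(\mathbf{u}_0, \mathbf{v}_0)$ is invertible. By the implicit
function theorem, there is a neighbourhood $\mathcal{D}$ of $\mathbf{u}_0$, a neighbourhood $\mathcal{V}$
of $\mathbf{v}_0$, and a continuously differentiable function $\mathbf{H} : \mathcal{D} \to \mathbb{R}^m$ such that
for $(\mathbf{u}, \mathbf{v}) \in \mathcal{D} \times \mathcal{V}$, $\mathbf{G}(\mathbf{u}, \mathbf{v}) = \mathbf{0}$ if and only if $\mathbf{v} = \mathbf{H}(\mathbf{u})$. Consider the
function $F : \mathcal{D} \to \mathbb{R}$ defined as

$$F(\mathbf{u}) = f(\mathbf{u}, \mathbf{H}(\mathbf{u})).$$
By (5.14), we find that

$$F(\mathbf{u}_0) \geq F(\mathbf{u}) \quad \text{for all } \mathbf{u} \in \mathcal{D}.$$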


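In other words, $\mathbf{u}_0$ is a maximizer of the function $F : \mathcal{D} \to \mathbb{R}$. Since $\mathbf{u}_0$
is an interior point of $\mathcal{D}$ and $F : \mathcal{D} \to \mathbb{R}$ is continuously differentiable,
$\nabla F(\mathbf{u}_0) = \mathbf{0}$. Since $F(\mathbf{u}) = f(\mathbf{u}, \mathbf{H}(\mathbf{u}))$, we find that

$$\nabla F(\mathbf{u}_0) = \mathbf{D}_{\mathbf{u}} f(\mathbf{u}_0, \mathbf{v}_0) + \mathbf{D}_{\mathbf{v}} f(\mathbf{u}_0, \mathbf{v}_0)\, \mathbf{DH}(\mathbf{u}_0) = \mathbf{0}. \tag{5.15}$$

On the other hand, applying the chain rule to $\mathbf{G}(\mathbf{u}, \mathbf{H}(\mathbf{u})) = \mathbf{0}$ and setting $\mathbf{u} = \mathbf{u}_0$,
we find that

$$\mathbf{D}_{\mathbf{u}}\mathbf{G}(\mathbf{u}_0, \mathbf{v}_0) + \mathbf{D}_{\mathbf{v}}\mathbf{G}(\mathbf{u}_0, \mathbf{v}_0)\, \mathbf{DH}(\mathbf{u}_0) = \mathbf{0}. \tag{5.16}$$

Take
$$\begin{bmatrix} \lambda_1 & \cdots & \lambda_m \end{bmatrix} = \boldsymbol{\lambda} = \mathbf{D}_{\mathbf{v}} f(\mathbf{x}_0)\, \mathbf{D}_{\mathbf{v}}\mathbf{G}(\mathbf{x}_0)^{-1}.$$

Then
$$\mathbf{D}_{\mathbf{v}} f(\mathbf{x}_0) = \boldsymbol{\lambda}\, \mathbf{D}_{\mathbf{v}}\mathbf{G}(\mathbf{x}_0). \tag{5.17}$$

Eqs. (5.15) and (5.16) show that
$$\mathbf{D}_{\mathbf{u}} f(\mathbf{x}_0) = -\boldsymbol{\lambda}\, \mathbf{D}_{\mathbf{v}}\mathbf{G}(\mathbf{x}_0)\, \mathbf{DH}(\mathbf{u}_0) = \boldsymbol{\lambda}\, \mathbf{D}_{\mathbf{u}}\mathbf{G}(\mathbf{x}_0). \tag{5.18}$$

Eqs. (5.17) and (5.18) together imply that

$$\nabla f(\mathbf{x}_0) = \boldsymbol{\lambda}\, \mathbf{DG}(\mathbf{x}_0) = \sum_{i=1}^{m} \lambda_i \nabla G_i(\mathbf{x}_0).$$

This completes the proof of the theorem.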


In the general constraint optimization problem proposed in Theorem 5.15,
there are $n + 2m$ variables $u_1, \ldots, u_n$, $v_1, \ldots, v_m$ and $\lambda_1, \ldots, \lambda_m$ to be solved.
The components of

$$\nabla f(\mathbf{x}) = \sum_{i=1}^{m} \lambda_i \nabla G_i(\mathbf{x})$$

give $n + m$ equations, while the components of $\mathbf{G}(\mathbf{x}) = \mathbf{0}$ give $m$ equations.
Hence, we have to solve for $n + 2m$ variables from $n + 2m$ equations. Let us look at
an example.




Example 5.18 


Let $K$ be the subset of $\mathbb{R}^3$ given by

$$K = \left\{ (x,y,z) \mid x^2 + y^2 \leq 4,\; x + y + z = 1 \right\}.$$

Find the maximum and minimum values of the function $f : K \to \mathbb{R}$,
$f(x,y,z) = x + 3y + z$.


Solution 
Notice that $K$ is the intersection of the two closed sets $K_1 = \{(x,y,z) \mid x^2 + y^2 \leq 4\}$
and $K_2 = \{(x,y,z) \mid x + y + z = 1\}$. Hence, $K$
is a closed set. If $(x,y,z)$ is in $K$, $x^2 + y^2 \leq 4$. Thus, $|x| \leq 2$, $|y| \leq 2$ and
hence $|z| \leq 1 + |x| + |y| \leq 5$. This shows that $K$ is bounded. Since $K$ is
closed and bounded, and $f : K \to \mathbb{R}$ is continuous, $f : K \to \mathbb{R}$ has maximum
and minimum values.
Let

$$D = \left\{ (x,y,z) \mid x^2 + y^2 < 4,\; x + y + z = 1 \right\},$$

$$C = \left\{ (x,y,z) \mid x^2 + y^2 = 4,\; x + y + z = 1 \right\}.$$

Then $K = C \cup D$. We can consider the extremizers of $f : D \to \mathbb{R}$ and
$f : C \to \mathbb{R}$ separately.
To find the extremizers of $f : D \to \mathbb{R}$, we can regard this as a constraint
optimization problem where we want to find the extreme values of $f : \mathcal{O} \to \mathbb{R}$,
$f(x,y,z) = x + 3y + z$ on

$$\mathcal{O} = \left\{ (x,y,z) \mid x^2 + y^2 < 4 \right\},$$

subject to the constraint $g(x,y,z) = 0$, where $g : \mathcal{O} \to \mathbb{R}$ is the function
$g(x,y,z) = x + y + z - 1$. Now $\nabla g(x,y,z) = (1,1,1) \neq \mathbf{0}$. Hence, at an
extremizer, we must have $\nabla f(x,y,z) = \lambda \nabla g(x,y,z)$, which gives

$$(1, 3, 1) = \lambda (1, 1, 1).$$

This says that the two vectors $(1,3,1)$ and $(1,1,1)$ must be parallel, which
is a contradiction. Hence, $f : D \to \mathbb{R}$ does not have extremizers.


Now, to find the extremizers of $f : C \to \mathbb{R}$, we can consider it as finding
the extreme values of $f : \mathbb{R}^3 \to \mathbb{R}$, $f(x,y,z) = x + 3y + z$, subject to
$\mathbf{G}(x,y,z) = \mathbf{0}$, where

$$\mathbf{G}(x,y,z) = (x^2 + y^2 - 4,\; x + y + z - 1).$$

Notice that

$$\mathbf{DG}(x,y,z) = \begin{bmatrix} 2x & 2y & 0 \\ 1 & 1 & 1 \end{bmatrix}.$$

This matrix has rank less than 2 if and only if $(2x, 2y, 0)$ is parallel to
$(1,1,1)$, which gives $x = y = 0$. But a point $(x,y,z)$ with $x = y = 0$ is
not on $C$. Therefore, $\mathbf{DG}(x,y,z)$ has maximal rank for every $(x,y,z) \in C$.
Using the Lagrange multiplier method, to solve for the extremizers of
$f : C \to \mathbb{R}$, we need to solve the system
$$\nabla f(x,y,z) = \lambda \nabla G_1(x,y,z) + \mu \nabla G_2(x,y,z), \qquad \mathbf{G}(x,y,z) = \mathbf{0}.$$
These give

$$1 = 2\lambda x + \mu, \quad 3 = 2\lambda y + \mu, \quad 1 = \mu,$$
$$x^2 + y^2 = 4, \quad x + y + z = 1.$$

From $\mu = 1$, we have $2\lambda x = 0$ and $2\lambda y = 2$. The latter implies that $\lambda \neq 0$.
Hence, we must have $x = 0$. Then $x^2 + y^2 = 4$ gives $y = \pm 2$. When
$(x,y) = (0,2)$, $z = -1$. When $(x,y) = (0,-2)$, $z = 3$. Hence, we only
have two candidates for extremizers, which are $(0,2,-1)$ and $(0,-2,3)$.
Since

$$f(0, 2, -1) = 5, \qquad f(0, -2, 3) = -3,$$

we find that $f : K \to \mathbb{R}$ has maximum value 5 at the point $(0,2,-1)$, and
minimum value $-3$ at the point $(0,-2,3)$.
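The answer to Example 5.18 can be cross-checked by sampling the boundary curve $C$. This is only a sketch under the assumed parametrization $x = 2\cos t$, $y = 2\sin t$, $z = 1 - x - y$ of $C$; the names `boundary`, `max_point` and `min_point` are hypothetical.

```python
import math

def f(x, y, z):
    return x + 3 * y + z

N = 3600  # number of sample points on the curve C
boundary = []
for k in range(N):
    t = 2 * math.pi * k / N
    x, y = 2 * math.cos(t), 2 * math.sin(t)
    z = 1 - x - y  # the plane x + y + z = 1
    boundary.append((f(x, y, z), (x, y, z)))

# Tuples are compared by their first entry, the value of f.
max_value, max_point = max(boundary)
min_value, min_point = min(boundary)
print(max_value, max_point)
print(min_value, min_point)
```

On the plane $x + y + z = 1$ we have $f = 1 + 2y$, so the extreme values on $K$ occur where $y = \pm 2$, in agreement with the sampled results.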




Exercises 5.4 


Question 1 


Find the extreme values of the function $f(x,y,z) = 4x^2 + y^2 + yz + z^2$ on
the set
$$S = \left\{ (x,y,z) \mid x^2 + y^2 + z^2 \leq 8 \right\}.$$


Question 2 
Find the points in the set

$$S = \left\{ (x,y) \mid 4x^2 + y^2 \leq 36,\; x^2 + 4y^2 \geq 4 \right\}$$
that are closest to and farthest from the point $(1, 0)$.


Question 3 


Use the Lagrange multiplier method to find the maximum and minimum
values of the function $f(x,y,z) = x + 2y - z$ on the set

$$S = \left\{ (x,y,z) \mid x^2 + y^2 + 4z^2 \leq 84 \right\},$$

and the points where each of them appears.


Question 4 


Find the extreme values of the function $f(x,y,z) = x$ on the set

$$S = \left\{ (x,y,z) \mid x^2 = y^2 + z^2,\; 7x + 3y + 4z = 60 \right\}.$$


Question 5 


Let K be the subset of R® given by 
KS {ay 2) aro 68 ye — 10h 


Find the maximum and minimum values of the function $f : K \to \mathbb{R}$,
$f(x,y,z) = x + 2y$.




Question 6 


Let $A$ be an $n \times n$ symmetric matrix, and let $Q_A : \mathbb{R}^n \to \mathbb{R}$ be the quadratic
form $Q_A(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$ defined by $A$. Show that the minimum and maximum
values of $Q_A : S^{n-1} \to \mathbb{R}$ on the unit sphere $S^{n-1}$ are the smallest and
largest eigenvalues of $A$.


Chapter 6. Multiple Integrals 343 


Chapter 6 
Multiple Integrals 


For single variable functions, we have discussed the Riemann integrability of
a function $f : [a,b] \to \mathbb{R}$ defined on a compact interval $[a,b]$. In this chapter,
we consider the theory of Riemann integrals for multivariable functions. For a
function $\mathbf{F} : \mathfrak{D} \to \mathbb{R}^m$ that takes values in $\mathbb{R}^m$ with $m \geq 2$, we define the integral
componentwise. Namely, we say that the function $\mathbf{F} : \mathfrak{D} \to \mathbb{R}^m$ is Riemann
integrable if and only if each of the component functions $F_j : \mathfrak{D} \to \mathbb{R}$, $1 \leq j \leq m$,
is Riemann integrable, and we define

$$\int_{\mathfrak{D}} \mathbf{F} = \left( \int_{\mathfrak{D}} F_1, \ldots, \int_{\mathfrak{D}} F_m \right).$$

Thus, in this chapter, we will only discuss the theory of integration for functions
$f : \mathfrak{D} \to \mathbb{R}$ that take values in $\mathbb{R}$.
A direct generalization of a compact interval $[a, b]$ to $\mathbb{R}^n$ is a product of compact
intervals $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, which is a closed rectangle. In this chapter, when we say $\mathbf{I}$
is a rectangle, it means $\mathbf{I}$ can be written as $\prod_{i=1}^{n} [a_i, b_i]$ with $a_i < b_i$ for all $1 \leq i \leq n$.
The edges of $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$ are $[a_1, b_1]$, $[a_2, b_2]$, ..., $[a_n, b_n]$.
We first discuss the integration theory of functions defined on closed rectangles
of the form $\prod_{i=1}^{n} [a_i, b_i]$. For applications, we need to consider functions defined on
other subsets $\mathfrak{D}$ of $\mathbb{R}^n$.
One of the most useful theoretical tools for evaluating single integrals is the
fundamental theorem of calculus. To apply this tool for multiple integrals, we
need to consider iterated integrals. Another useful tool is the change of variables
formula. For multivariable functions, the change of variables theorem is much
more complicated. Nevertheless, we will discuss these in this chapter.




6.1 Riemann Integrals 


In this section, we define the Riemann integral of a function $f : \mathfrak{D} \to \mathbb{R}$ defined
on a subset $\mathfrak{D}$ of $\mathbb{R}^n$. We first consider the case where $\mathfrak{D} = \prod_{i=1}^{n} [a_i, b_i]$.
Let us first consider partitions. We say that $P = \{x_0, x_1, \ldots, x_k\}$ is a partition
of the interval $[a, b]$ if $a = x_0 < x_1 < \cdots < x_{k-1} < x_k = b$. It divides $[a, b]$ into
$k$ subintervals $J_1, \ldots, J_k$, where $J_i = [x_{i-1}, x_i]$.


Definition 6.1 Partitions 


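A partition $\mathbf{P}$ of a closed rectangle $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$ is achieved by having a
partition $P_i$ of $[a_i, b_i]$ for each $1 \leq i \leq n$. We write $\mathbf{P} = (P_1, P_2, \ldots, P_n)$
for such a partition. The partition $\mathbf{P}$ divides the rectangle $\mathbf{I}$ into a collection
$\mathcal{J}_{\mathbf{P}}$ of rectangles, any two of which have disjoint interiors. A closed
rectangle $\mathbf{J}$ in $\mathcal{J}_{\mathbf{P}}$ can be written as

$$\mathbf{J} = J_1 \times J_2 \times \cdots \times J_n,$$

where $J_i$, $1 \leq i \leq n$, is a subinterval in the partition $P_i$.


If the partition $P_i$ divides $[a_i, b_i]$ into $k_i$ subintervals, then the partition $\mathbf{P} =
(P_1, \ldots, P_n)$ divides the rectangle $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$ into $|\mathcal{J}_{\mathbf{P}}| = k_1 k_2 \cdots k_n$ rectangles.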


Example 6.1 


Consider the rectangle $\mathbf{I} = [-2, 9] \times [1, 6]$. Let $P_1 = \{-2, 0, 4, 9\}$ and
$P_2 = \{1, 3, 6\}$. The partition $P_1$ divides the interval $I_1 = [-2, 9]$ into
the three subintervals $[-2, 0]$, $[0, 4]$ and $[4, 9]$. The partition $P_2$ divides the
interval $I_2 = [1, 6]$ into the two subintervals $[1, 3]$ and $[3, 6]$. Therefore,
the partition $\mathbf{P} = (P_1, P_2)$ divides the rectangle $\mathbf{I}$ into the following six
rectangles.




Figure 6.1: A partition of the rectangle [—2, 9] x [1,6] given in Example 6.1. 
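The six rectangles of Example 6.1 can be listed mechanically. The sketch below represents a one-dimensional partition as a sorted list of partition points and a rectangle as a tuple of intervals; this representation and the helper names `subintervals` and `rectangles` are choices made here for illustration.

```python
from itertools import product

def subintervals(points):
    # Consecutive partition points give the subintervals [x_{i-1}, x_i].
    return list(zip(points[:-1], points[1:]))

def rectangles(partition):
    # Each rectangle is a product J_1 x ... x J_n of subintervals.
    return list(product(*(subintervals(p) for p in partition)))

P = ([-2, 0, 4, 9], [1, 3, 6])
rects = rectangles(P)
for rect in rects:
    print(" x ".join(f"[{a}, {b}]" for a, b in rect))
```

The number of rectangles produced is $k_1 k_2 = 3 \times 2 = 6$, as in the example.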


Definition 6.2 Regular and Uniformly Regular Partitions 


Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$ be a rectangle in $\mathbb{R}^n$. We say that $\mathbf{P} = (P_1, \ldots, P_n)$ is
a regular partition of $\mathbf{I}$ if for each $1 \leq i \leq n$, $P_i$ is a regular partition of
$[a_i, b_i]$ into $k_i$ intervals. We say that $\mathbf{P}$ is a uniformly regular partition of $\mathbf{I}$
into $k^n$ rectangles if for each $1 \leq i \leq n$, $P_i$ is a regular partition of $[a_i, b_i]$
into $k$ intervals.


Example 6.2


Consider the rectangle $\mathbf{I} = [-2, 7] \times [-4, 8]$.

(a) The partition $\mathbf{P} = (P_1, P_2)$ where $P_1 = \{-2, 1, 4, 7\}$ and $P_2 =
\{-4, -1, 2, 5, 8\}$ is a regular partition of $\mathbf{I}$.

(b) The partition $\mathbf{P} = (P_1, P_2)$ where $P_1 = \{-2, 1, 4, 7\}$ and $P_2 =
\{-4, 0, 4, 8\}$ is a uniformly regular partition of $\mathbf{I}$ into $3^2 = 9$ rectangles.


The length of an interval $[a, b]$ is $b - a$. The area of a rectangle $[a, b] \times [c, d]$
is $(b - a) \times (d - c)$. In general, we define the volume of a closed rectangle of the
form $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$ in $\mathbb{R}^n$ as follows.


Figure 6.2: A regular and a uniformly regular partition of $[-2, 7] \times [-4, 8]$
discussed in Example 6.2.


Definition 6.3 Volume of a Rectangle 


The volume of the closed rectangle $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$ is defined as the product
of the lengths of all its edges. Namely,

$$\text{vol}(\mathbf{I}) = \prod_{i=1}^{n} (b_i - a_i).$$


Example 6.3 


The volume of the rectangle $\mathbf{I} = [-2, 9] \times [1, 6]$ is

$$\text{vol}(\mathbf{I}) = 11 \times 5 = 55.$$


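When $P = \{x_0, x_1, \ldots, x_k\}$ is a partition of $[a, b]$, it divides $[a, b]$ into $k$
subintervals $J_1, \ldots, J_k$, where $J_i = [x_{i-1}, x_i]$. Notice that
$$\sum_{i=1}^{k} \text{vol}(J_i) = \sum_{i=1}^{k} (x_i - x_{i-1}) = b - a.$$

Assume that $\mathbf{P} = (P_1, \ldots, P_n)$ is a partition of the rectangle $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$ in $\mathbb{R}^n$.
Then for $1 \leq i \leq n$, $P_i$ is a partition of $[a_i, b_i]$. If $P_i$ divides $[a_i, b_i]$ into the $k_i$
subintervals $J_{i,1}, J_{i,2}, \ldots, J_{i,k_i}$, then the collection of rectangles in the partition $\mathbf{P}$
is
$$\mathcal{J}_{\mathbf{P}} = \left\{ J_{1,m_1} \times \cdots \times J_{n,m_n} \mid 1 \leq m_i \leq k_i \text{ for } 1 \leq i \leq n \right\}.$$

Notice that
$$\text{vol}(J_{1,m_1} \times \cdots \times J_{n,m_n}) = \text{vol}(J_{1,m_1}) \times \cdots \times \text{vol}(J_{n,m_n}).$$

From this, we obtain the sum of volumes formula:

$$\begin{aligned}
\sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} \text{vol}(\mathbf{J}) &= \sum_{m_n=1}^{k_n} \cdots \sum_{m_1=1}^{k_1} \text{vol}(J_{1,m_1}) \times \cdots \times \text{vol}(J_{n,m_n}) \\
&= \left[ \sum_{m_1=1}^{k_1} \text{vol}(J_{1,m_1}) \right] \times \cdots \times \left[ \sum_{m_n=1}^{k_n} \text{vol}(J_{n,m_n}) \right] \\
&= (b_1 - a_1) \times \cdots \times (b_n - a_n) \\
&= \text{vol}(\mathbf{I}).
\end{aligned}$$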


Proposition 6.1 


Let $\mathbf{P}$ be a partition of $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$. Then the sum of the volumes of the
rectangles $\mathbf{J}$ in the partition $\mathbf{P}$ is equal to the volume of the rectangle $\mathbf{I}$.
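The sum of volumes formula is easy to test on concrete partitions. In the sketch below, a rectangle is a tuple of intervals `(a_i, b_i)` and a partition is a tuple of lists of partition points; these conventions and the helper names are assumptions made for illustration.

```python
from itertools import product
from math import prod

def subintervals(points):
    return list(zip(points[:-1], points[1:]))

def volume(rect):
    # Product of the lengths of the edges.
    return prod(b - a for a, b in rect)

def total_volume(partition):
    # Sum of vol(J) over all rectangles J of the partition.
    return sum(volume(r) for r in product(*map(subintervals, partition)))

I = ((-2, 9), (1, 6))
P = ([-2, 0, 4, 9], [1, 3, 6])
print(volume(I), total_volume(P))

# A three-dimensional rectangle [0,2] x [0,3] x [1,5] with an arbitrary partition.
I3 = ((0, 2), (0, 3), (1, 5))
P3 = ([0, 1, 2], [0, 2, 3], [1, 2, 4, 5])
print(volume(I3), total_volume(P3))
```

In both cases the two printed numbers coincide, as Proposition 6.1 asserts.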


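One of the motivations to define the integral $\int_{\mathbf{I}} f$ for a nonnegative function
$f : \mathbf{I} \to \mathbb{R}$ is to find the volume bounded between the graph of $f$ and the rectangle
$\mathbf{I}$ in $\mathbb{R}^{n+1}$. To find the volume, we partition $\mathbf{I}$ into small rectangles, pick a point
$\boldsymbol{\xi}_{\mathbf{J}}$ in each of these rectangles $\mathbf{J}$, and approximate the function on $\mathbf{J}$ as a constant
given by the value $f(\boldsymbol{\xi}_{\mathbf{J}})$. The volume between the rectangle $\mathbf{J}$ and the graph of
$f$ over $\mathbf{J}$ is then approximated by $f(\boldsymbol{\xi}_{\mathbf{J}})\, \text{vol}(\mathbf{J})$. This leads us to the concept of
Riemann sums.
If $\mathbf{P}$ is a partition of $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, we say that $A$ is a set of intermediate
points for the partition $\mathbf{P}$ if $A = \{\boldsymbol{\xi}_{\mathbf{J}} \mid \mathbf{J} \in \mathcal{J}_{\mathbf{P}}\}$ is a subset of $\mathbf{I}$ indexed by $\mathcal{J}_{\mathbf{P}}$,
such that $\boldsymbol{\xi}_{\mathbf{J}} \in \mathbf{J}$ for each $\mathbf{J} \in \mathcal{J}_{\mathbf{P}}$.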




Definition 6.4 Riemann Sums 


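Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be a function defined on $\mathbf{I}$. Given
a partition $\mathbf{P}$ of $\mathbf{I}$, a set $A = \{\boldsymbol{\xi}_{\mathbf{J}} \mid \mathbf{J} \in \mathcal{J}_{\mathbf{P}}\}$ of intermediate points for the
partition $\mathbf{P}$, the Riemann sum of $f$ with respect to the partition $\mathbf{P}$ and the
set of intermediate points $A = \{\boldsymbol{\xi}_{\mathbf{J}}\}$ is the sum

$$R(f, \mathbf{P}, A) = \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} f(\boldsymbol{\xi}_{\mathbf{J}})\, \text{vol}(\mathbf{J}).$$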


Example 6.4 


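Let $\mathbf{I} = [-2, 9] \times [1, 6]$, and let $\mathbf{P} = (P_1, P_2)$ be the partition of $\mathbf{I}$ with
$P_1 = \{-2, 0, 4, 9\}$ and $P_2 = \{1, 3, 6\}$. Let $f : \mathbf{I} \to \mathbb{R}$ be the function
defined as $f(x,y) = x^2 + y$. Consider a set of intermediate points $A$ as
follows.

J | f(ξ_J) | vol (J)
[−2, 0] × [1, 3] | 2 | 4
[−2, 0] × [3, 6] | 3 | 6
[0, 4] × [1, 3] | 2 | 8
[0, 4] × [3, 6] | 8 | 12
[4, 9] × [1, 3] | 18 | 10
[4, 9] × [3, 6] | 84 | 15

The Riemann sum $R(f, \mathbf{P}, A)$ is equal to
$$2 \times 4 + 3 \times 6 + 2 \times 8 + 8 \times 12 + 18 \times 10 + 84 \times 15 = 1578.$$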
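The Riemann sum of Example 6.4 can be reproduced by a short computation. The intermediate points used below are hypothetical: they are one choice of points $\boldsymbol{\xi}_{\mathbf{J}} \in \mathbf{J}$ whose values under $f(x,y) = x^2 + y$ are 2, 3, 2, 8, 18 and 84, and they need not be the points used in the example.

```python
def f(x, y):
    return x**2 + y

# Rectangle ((a, b), (c, d)) -> hypothetical intermediate point in it.
intermediate_points = {
    ((-2, 0), (1, 3)): (-1, 1),
    ((-2, 0), (3, 6)): (0, 3),
    ((0, 4), (1, 3)): (1, 1),
    ((0, 4), (3, 6)): (2, 4),
    ((4, 9), (1, 3)): (4, 2),
    ((4, 9), (3, 6)): (9, 3),
}

def riemann_sum(points):
    total = 0
    for ((a, b), (c, d)), (x, y) in points.items():
        assert a <= x <= b and c <= y <= d  # the point must lie in J
        total += f(x, y) * (b - a) * (d - c)
    return total

print(riemann_sum(intermediate_points))
```

A different choice of intermediate points generally gives a different Riemann sum for the same partition.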


Example 6.5 


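If $f : \mathbf{I} \to \mathbb{R}$ is the constant function $f(\mathbf{x}) = c$, then for any partition $\mathbf{P}$ of
$\mathbf{I}$ and any set of intermediate points $A = \{\boldsymbol{\xi}_{\mathbf{J}}\}$,

$$R(f, \mathbf{P}, A) = c\, \text{vol}(\mathbf{I}).$$

When $c > 0$, this is the volume of the rectangle $\mathbf{I} \times [0, c]$ in $\mathbb{R}^{n+1}$.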




As in the single variable case, Darboux sums provide bounds for Riemann 


sums. 


Definition 6.5 Darboux Sums 


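Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be a bounded function defined on
$\mathbf{I}$. Given a partition $\mathbf{P}$ of $\mathbf{I}$, let $\mathcal{J}_{\mathbf{P}}$ be the collection of rectangles in the
partition $\mathbf{P}$. For each $\mathbf{J}$ in $\mathcal{J}_{\mathbf{P}}$, let

$$m_{\mathbf{J}} = \inf \{ f(\mathbf{x}) \mid \mathbf{x} \in \mathbf{J} \} \quad \text{and} \quad M_{\mathbf{J}} = \sup \{ f(\mathbf{x}) \mid \mathbf{x} \in \mathbf{J} \}.$$

The Darboux lower sum $L(f, \mathbf{P})$ and the Darboux upper sum $U(f, \mathbf{P})$ are
defined as

$$L(f, \mathbf{P}) = \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} m_{\mathbf{J}}\, \text{vol}(\mathbf{J}) \quad \text{and} \quad U(f, \mathbf{P}) = \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} M_{\mathbf{J}}\, \text{vol}(\mathbf{J}).$$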


Example 6.6 


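If $f : \mathbf{I} \to \mathbb{R}$ is the constant function $f(\mathbf{x}) = c$, then

$$L(f, \mathbf{P}) = c\, \text{vol}(\mathbf{I}) = U(f, \mathbf{P}) \quad \text{for any partition } \mathbf{P} \text{ of } \mathbf{I}.$$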


Example 6.7 


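Consider the function $f : \mathbf{I} \to \mathbb{R}$, $f(x,y) = x^2 + y$ defined in Example
6.4, where $\mathbf{I} = [-2, 9] \times [1, 6]$. For the partition $\mathbf{P} = (P_1, P_2)$ with $P_1 =
\{-2, 0, 4, 9\}$ and $P_2 = \{1, 3, 6\}$, we have the following.

J | m_J | M_J | vol (J)
[−2, 0] × [1, 3] | 0² + 1 = 1 | (−2)² + 3 = 7 | 4
[−2, 0] × [3, 6] | 0² + 3 = 3 | (−2)² + 6 = 10 | 6
[0, 4] × [1, 3] | 0² + 1 = 1 | 4² + 3 = 19 | 8
[0, 4] × [3, 6] | 0² + 3 = 3 | 4² + 6 = 22 | 12
[4, 9] × [1, 3] | 4² + 1 = 17 | 9² + 3 = 84 | 10
[4, 9] × [3, 6] | 4² + 3 = 19 | 9² + 6 = 87 | 15

Therefore, the Darboux lower sum is

$$L(f, \mathbf{P}) = 1 \times 4 + 3 \times 6 + 1 \times 8 + 3 \times 12 + 17 \times 10 + 19 \times 15 = 521,$$

while the Darboux upper sum is

$$U(f, \mathbf{P}) = 7 \times 4 + 10 \times 6 + 19 \times 8 + 22 \times 12 + 84 \times 10 + 87 \times 15 = 2649.$$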
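The Darboux sums of Example 6.7 can be recomputed as follows. The sketch relies on a fact specific to $f(x,y) = x^2 + y$: on a rectangle $[a,b] \times [c,d]$, the infimum is the least value of $x^2$ on $[a,b]$ plus $c$, and the supremum is the largest value of $x^2$ on $[a,b]$ plus $d$. The helper names are illustrative.

```python
from itertools import product

def subintervals(points):
    return list(zip(points[:-1], points[1:]))

def inf_sup(rect):
    # Infimum and supremum of f(x, y) = x^2 + y on [a, b] x [c, d].
    (a, b), (c, d) = rect
    min_x2 = 0 if a <= 0 <= b else min(a * a, b * b)
    max_x2 = max(a * a, b * b)
    return min_x2 + c, max_x2 + d

def darboux_table(P1, P2):
    rows = []
    for rect in product(subintervals(P1), subintervals(P2)):
        (a, b), (c, d) = rect
        m, M = inf_sup(rect)
        rows.append((rect, m, M, (b - a) * (d - c)))
    return rows

rows = darboux_table([-2, 0, 4, 9], [1, 3, 6])
lower = sum(m * vol for _, m, _, vol in rows)
upper = sum(M * vol for _, _, M, vol in rows)
print(lower, upper)
```

For a general bounded function the infimum and supremum on each rectangle are not available in closed form, so this approach is limited to simple functions such as this one.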


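Notice that we can only define Darboux sums if the function $f : \mathbf{I} \to \mathbb{R}$ is
bounded. This means that there are constants $m$ and $M$ such that
$$m \leq f(\mathbf{x}) \leq M \quad \text{for all } \mathbf{x} \in \mathbf{I}.$$

If $\mathbf{P}$ is a partition of the rectangle $\mathbf{I}$, $\mathbf{J}$ is a rectangle in the partition $\mathbf{P}$, and $\boldsymbol{\xi}_{\mathbf{J}}$ is a
point in $\mathbf{J}$, then
$$m \leq m_{\mathbf{J}} \leq f(\boldsymbol{\xi}_{\mathbf{J}}) \leq M_{\mathbf{J}} \leq M.$$

Multiplying throughout by $\text{vol}(\mathbf{J})$ and summing over $\mathbf{J} \in \mathcal{J}_{\mathbf{P}}$, we obtain the
following.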


Proposition 6.2 


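Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be a bounded function defined on $\mathbf{I}$.
If
$$m \leq f(\mathbf{x}) \leq M \quad \text{for all } \mathbf{x} \in \mathbf{I},$$

then for any partition $\mathbf{P}$ of $\mathbf{I}$, and for any choice of intermediate points
$A = \{\boldsymbol{\xi}_{\mathbf{J}}\}$ for the partition $\mathbf{P}$, we have

$$m\, \text{vol}(\mathbf{I}) \leq L(f, \mathbf{P}) \leq R(f, \mathbf{P}, A) \leq U(f, \mathbf{P}) \leq M\, \text{vol}(\mathbf{I}).$$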


To study the behaviour of the Darboux sums when we modify the partitions, 


we first extend the concept of refinement of a partition to rectangles in R”. Recall 


that if $P$ and $P^*$ are partitions of the interval $[a, b]$, $P^*$ is a refinement of $P$ if each
partition point of $P$ is also a partition point of $P^*$.




Definition 6.6 Refinement of a Partition 


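Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $\mathbf{P} = (P_1, \ldots, P_n)$ and $\mathbf{P}^* = (P_1^*, \ldots, P_n^*)$ be
partitions of $\mathbf{I}$. We say that $\mathbf{P}^*$ is a refinement of $\mathbf{P}$ if for each $1 \leq i \leq n$,
$P_i^*$ is a refinement of $P_i$.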


Figure 6.3: A refinement of the partition of the rectangle [—2, 9] x [1,6] given in 
Figure 6.1. 


Example 6.8 


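Let us consider the partition $\mathbf{P} = (P_1, P_2)$ of the rectangle $\mathbf{I} = [-2, 9] \times
[1, 6]$ given in Example 6.1, with $P_1 = \{-2, 0, 4, 9\}$ and $P_2 = \{1, 3, 6\}$.
Let $P_1^* = \{-2, 0, 1, 4, 6, 9\}$ and $P_2^* = \{1, 3, 4, 6\}$. Then $\mathbf{P}^* = (P_1^*, P_2^*)$ is
a refinement of $\mathbf{P}$.


If the partition $\mathbf{P}^*$ is a refinement of the partition $\mathbf{P}$, then for each $\mathbf{J}$ in $\mathcal{J}_{\mathbf{P}}$, $\mathbf{P}^*$
induces a partition of $\mathbf{J}$, which we denote by $\mathbf{P}^*(\mathbf{J})$.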


Example 6.9 


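The partition $\mathbf{P}^*$ in Example 6.8 induces the partition $\mathbf{P}^*(\mathbf{J}) =
(P_1^*(\mathbf{J}), P_2^*(\mathbf{J}))$ of the rectangle $\mathbf{J} = [0, 4] \times [3, 6]$, where $P_1^*(\mathbf{J}) = \{0, 1, 4\}$
and $P_2^*(\mathbf{J}) = \{3, 4, 6\}$. The partition $\mathbf{P}^*(\mathbf{J})$ divides the rectangle $\mathbf{J}$ into 4
rectangles, as shown in Figure 6.3.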




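If the partition $\mathbf{P}^*$ is a refinement of the partition $\mathbf{P}$, then the collection of
rectangles in $\mathbf{P}^*$ is the union of the collection of rectangles in $\mathbf{P}^*(\mathbf{J})$ when $\mathbf{J}$
ranges over the collection of rectangles in $\mathbf{P}$. Namely,

$$\mathcal{J}_{\mathbf{P}^*} = \bigcup_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} \mathcal{J}_{\mathbf{P}^*(\mathbf{J})}.$$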


Using this, we can deduce the following. 


Proposition 6.3 


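Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be a bounded function defined on $\mathbf{I}$.
If $\mathbf{P}$ and $\mathbf{P}^*$ are partitions of $\mathbf{I}$ and $\mathbf{P}^*$ is a refinement of $\mathbf{P}$, then

$$L(f, \mathbf{P}^*) = \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} L(f, \mathbf{P}^*(\mathbf{J})), \qquad U(f, \mathbf{P}^*) = \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} U(f, \mathbf{P}^*(\mathbf{J})).$$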


From this, we can show that a refinement improves the Darboux sums, in the 


sense that a lower sum increases, and an upper sum decreases. 


Theorem 6.4 


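Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be a bounded function defined on $\mathbf{I}$.
If $\mathbf{P}$ and $\mathbf{P}^*$ are partitions of $\mathbf{I}$ and $\mathbf{P}^*$ is a refinement of $\mathbf{P}$, then

$$L(f, \mathbf{P}) \leq L(f, \mathbf{P}^*) \leq U(f, \mathbf{P}^*) \leq U(f, \mathbf{P}).$$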


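For each rectangle $\mathbf{J}$ in the partition $\mathbf{P}$,

$$m_{\mathbf{J}} \leq f(\mathbf{x}) \leq M_{\mathbf{J}} \quad \text{for all } \mathbf{x} \in \mathbf{J}.$$

Applying Proposition 6.2 to the function $f : \mathbf{J} \to \mathbb{R}$ and the partition
$\mathbf{P}^*(\mathbf{J})$, we find that

$$m_{\mathbf{J}}\, \text{vol}(\mathbf{J}) \leq L(f, \mathbf{P}^*(\mathbf{J})) \leq U(f, \mathbf{P}^*(\mathbf{J})) \leq M_{\mathbf{J}}\, \text{vol}(\mathbf{J}).$$

Summing over $\mathbf{J} \in \mathcal{J}_{\mathbf{P}}$, we find that

$$L(f, \mathbf{P}) \leq \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} L(f, \mathbf{P}^*(\mathbf{J})) \leq \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} U(f, \mathbf{P}^*(\mathbf{J})) \leq U(f, \mathbf{P}).$$

The assertion follows from Proposition 6.3.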
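To see Theorem 6.4 at work on a concrete case, the sketch below compares the Darboux sums of $f(x,y) = x^2 + y$ for the partition $\mathbf{P}$ of Example 6.1 and its refinement $\mathbf{P}^*$ of Example 6.8. The closed-form infimum and supremum used in `darboux_sums` are specific to this function, and the function name is chosen here for illustration.

```python
from itertools import product

def subintervals(points):
    return list(zip(points[:-1], points[1:]))

def darboux_sums(P1, P2):
    # Lower and upper Darboux sums of f(x, y) = x^2 + y for the partition (P1, P2).
    lower = upper = 0
    for (a, b), (c, d) in product(subintervals(P1), subintervals(P2)):
        vol = (b - a) * (d - c)
        min_x2 = 0 if a <= 0 <= b else min(a * a, b * b)
        max_x2 = max(a * a, b * b)
        lower += (min_x2 + c) * vol
        upper += (max_x2 + d) * vol
    return lower, upper

L_P, U_P = darboux_sums([-2, 0, 4, 9], [1, 3, 6])
L_star, U_star = darboux_sums([-2, 0, 1, 4, 6, 9], [1, 3, 4, 6])
print(L_P, L_star, U_star, U_P)
```

The four printed numbers are nondecreasing from left to right: the refinement raises the lower sum and lowers the upper sum.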


It is difficult to visualize the Darboux sums of a multivariable function.
Hence, we illustrate how refinements improve Darboux sums using single variable
functions, as shown in Figure 6.4 and Figure 6.5.


Figure 6.4: A refinement of the partition increases the Darboux lower sum.


Figure 6.5: A refinement of the partition decreases the Darboux upper sum.


As a consequence of Theorem 6.4, we can prove the following. 




Corollary 6.5 


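Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be a bounded function defined on $\mathbf{I}$.
For any two partitions $\mathbf{P}_1$ and $\mathbf{P}_2$ of $\mathbf{I}$,

$$L(f, \mathbf{P}_1) \leq U(f, \mathbf{P}_2).$$


Let $\mathbf{P}_1 = (P_{1,1}, \ldots, P_{1,n})$ and $\mathbf{P}_2 = (P_{2,1}, \ldots, P_{2,n})$. For $1 \leq i \leq n$,
let $P_i^*$ be the common refinement of $P_{1,i}$ and $P_{2,i}$ obtained by taking
the union of the partition points in $P_{1,i}$ and $P_{2,i}$. Then $\mathbf{P}^* = (P_1^*, \ldots, P_n^*)$
is a common refinement of the partitions $\mathbf{P}_1$ and $\mathbf{P}_2$. By Theorem 6.4,

$$L(f, \mathbf{P}_1) \leq L(f, \mathbf{P}^*) \leq U(f, \mathbf{P}^*) \leq U(f, \mathbf{P}_2).$$


Now we define lower and upper integrals of a bounded function $f : \mathbf{I} \to \mathbb{R}$.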


Definition 6.7 Lower Integrals and Upper Integrals 


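Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be a bounded function defined on $\mathbf{I}$.
Let $S_L(f)$ be the set of Darboux lower sums of $f$, and let $S_U(f)$ be the set
of Darboux upper sums of $f$.


1. The lower integral of $f$, denoted by $\underline{\int_{\mathbf{I}}} f$, is defined as the least upper
bound of the Darboux lower sums.

$$\underline{\int_{\mathbf{I}}} f = \sup S_L(f) = \sup \{ L(f, \mathbf{P}) \mid \mathbf{P} \text{ is a partition of } \mathbf{I} \}.$$


2. The upper integral of $f$, denoted by $\overline{\int_{\mathbf{I}}} f$, is defined as the greatest lower
bound of the Darboux upper sums.

$$\overline{\int_{\mathbf{I}}} f = \inf S_U(f) = \inf \{ U(f, \mathbf{P}) \mid \mathbf{P} \text{ is a partition of } \mathbf{I} \}.$$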
I 




Example 6.10 


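If $f : \mathbf{I} \to \mathbb{R}$ is the constant function $f(\mathbf{x}) = c$, then for any partition $\mathbf{P}$ of
$\mathbf{I}$,

$$L(f, \mathbf{P}) = c\, \text{vol}(\mathbf{I}) = U(f, \mathbf{P}).$$
Therefore, both $S_L(f)$ and $S_U(f)$ are the one-element set $\{c\, \text{vol}(\mathbf{I})\}$. This
shows that
$$\underline{\int_{\mathbf{I}}} f = \overline{\int_{\mathbf{I}}} f = c\, \text{vol}(\mathbf{I}).$$


For a constant function, the lower integral and the upper integral are the same.
For a general bounded function, we have the following.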


Theorem 6.6 


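Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be a bounded function defined on $\mathbf{I}$.
Then we have

$$\underline{\int_{\mathbf{I}}} f \leq \overline{\int_{\mathbf{I}}} f.$$


By Corollary 6.5, every element of $S_L(f)$ is less than or equal to any
element of $S_U(f)$. This implies that

$$\underline{\int_{\mathbf{I}}} f = \sup S_L(f) \leq \inf S_U(f) = \overline{\int_{\mathbf{I}}} f.$$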


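Example 6.11


Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be the function defined as
$$f(\mathbf{x}) = \begin{cases} 1, & \text{if all components of } \mathbf{x} \text{ are rational}, \\ 0, & \text{otherwise}. \end{cases}$$

This is known as the Dirichlet's function. Find the lower integral and the
upper integral of $f : \mathbf{I} \to \mathbb{R}$.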




Solution 
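Let $\mathbf{P} = (P_1, \ldots, P_n)$ be a partition of $\mathbf{I}$. A rectangle $\mathbf{J}$ in the partition $\mathbf{P}$
can be written in the form $\mathbf{J} = \prod_{i=1}^{n} [u_i, v_i]$. By denseness of rational numbers
and irrational numbers, there exist a rational number $\alpha_i$ and an irrational
number $\beta_i$ in $(u_i, v_i)$. Let $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_n)$ and $\boldsymbol{\beta} = (\beta_1, \ldots, \beta_n)$. Then $\boldsymbol{\alpha}$
and $\boldsymbol{\beta}$ are points in $\mathbf{J}$, and

$$0 = f(\boldsymbol{\beta}) \leq f(\mathbf{x}) \leq f(\boldsymbol{\alpha}) = 1 \quad \text{for all } \mathbf{x} \in \mathbf{J}.$$
Therefore,
$$m_{\mathbf{J}} = \inf_{\mathbf{x} \in \mathbf{J}} f(\mathbf{x}) = 0, \qquad M_{\mathbf{J}} = \sup_{\mathbf{x} \in \mathbf{J}} f(\mathbf{x}) = 1.$$

It follows that
$$L(f, \mathbf{P}) = \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} m_{\mathbf{J}}\, \text{vol}(\mathbf{J}) = 0,$$
$$U(f, \mathbf{P}) = \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} M_{\mathbf{J}}\, \text{vol}(\mathbf{J}) = \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} \text{vol}(\mathbf{J}) = \text{vol}(\mathbf{I}).$$
Therefore,

$$S_L(f) = \{0\}, \quad \text{while} \quad S_U(f) = \{\text{vol}(\mathbf{I})\}.$$

This shows that the lower integral and the upper integral of $f : \mathbf{I} \to \mathbb{R}$ are
given respectively by

$$\underline{\int_{\mathbf{I}}} f = 0 \quad \text{and} \quad \overline{\int_{\mathbf{I}}} f = \text{vol}(\mathbf{I}).$$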


As we mentioned before, one of the motivations to define the integral of $f : \mathbf{I} \to \mathbb{R}$
is to calculate volumes. Given that $f : \mathbf{I} \to \mathbb{R}$ is a nonnegative continuous
function defined on the rectangle $\mathbf{I}$ in $\mathbb{R}^n$, let

$$S = \{ (\mathbf{x}, y) \mid \mathbf{x} \in \mathbf{I},\; 0 \leq y \leq f(\mathbf{x}) \},$$

which is the solid bounded between $\mathbf{I}$ and the graph of $f$. It is reasonable to expect
that $S$ has a volume, which we denote by $\text{vol}(S)$. We want to define the integral
$\int_{\mathbf{I}} f$ so that it gives $\text{vol}(S)$. Notice that if $\mathbf{P}$ is a partition of $\mathbf{I}$, then the Darboux
lower sum

$$L(f, \mathbf{P}) = \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} m_{\mathbf{J}}\, \text{vol}(\mathbf{J})$$

is the sum of volumes of the collection of rectangles

$$\{ \mathbf{J} \times [0, m_{\mathbf{J}}] \mid \mathbf{J} \in \mathcal{J}_{\mathbf{P}} \}$$

in $\mathbb{R}^{n+1}$, each of which is contained in $S$. Since any two of these rectangles can
only intersect on the boundaries, it is reasonable to expect that
$$L(f, \mathbf{P}) \leq \text{vol}(S).$$
Similarly, the Darboux upper sum

$$U(f, \mathbf{P}) = \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} M_{\mathbf{J}}\, \text{vol}(\mathbf{J})$$

is the sum of volumes of the collection of rectangles

$$\{ \mathbf{J} \times [0, M_{\mathbf{J}}] \mid \mathbf{J} \in \mathcal{J}_{\mathbf{P}} \}$$

in $\mathbb{R}^{n+1}$, the union of which contains $S$. Therefore, it is reasonable to expect that

$$\text{vol}(S) \leq U(f, \mathbf{P}).$$

Hence, the volume of $S$ should be a number between $L(f, \mathbf{P})$ and $U(f, \mathbf{P})$ for
any partition $\mathbf{P}$. To make the volume well-defined, there should be only one
number between $L(f, \mathbf{P})$ and $U(f, \mathbf{P})$ for all partitions $\mathbf{P}$. By definition, any
number between the lower integral and the upper integral is in between $L(f, \mathbf{P})$
and $U(f, \mathbf{P})$ for any partition $\mathbf{P}$. Hence, to have the volume well-defined, we must
require the lower integral and the upper integral to be the same. This motivates
the following definition of integrability for a general bounded function.




Definition 6.8 Riemann integrability 


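Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be a bounded function defined on $\mathbf{I}$.
We say that $f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable, or simply integrable, if

$$\underline{\int_{\mathbf{I}}} f = \overline{\int_{\mathbf{I}}} f.$$

In this case, we define the integral of $f$ over the rectangle $\mathbf{I}$ as

$$\int_{\mathbf{I}} f = \underline{\int_{\mathbf{I}}} f = \overline{\int_{\mathbf{I}}} f.$$

It is the unique number larger than or equal to all Darboux lower sums, and
smaller than or equal to all Darboux upper sums.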


Example 6.12 


Example 6.10 says that a constant function $f : \mathbf{I} \to \mathbb{R}$, $f(\mathbf{x}) = c$ is Riemann
integrable and

$$\int_{\mathbf{I}} f = c\, \text{vol}(\mathbf{I}).$$


Example 6.13 


The Dirichlet’s function defined in Example 6.11 is not Riemann integrable 
since the lower integral and the upper integral are not equal. 


Leibniz Notation for Riemann Integrals 


The Leibniz notation of the Riemann integral of $f : \mathbf{I} \to \mathbb{R}$ is

$$\int_{\mathbf{I}} f(\mathbf{x})\, d\mathbf{x}, \quad \text{or equivalently,} \quad \int_{\mathbf{I}} f(x_1, \ldots, x_n)\, dx_1 \cdots dx_n.$$


As in the single variable case, there are some criteria for Riemann integrability
which follow directly from the criterion that the lower integral and the upper
integral are the same.




Theorem 6.7 


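Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be a bounded function defined on $\mathbf{I}$.
The following are equivalent.


(a) The function $f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable.


(b) For every $\varepsilon > 0$, there is a partition $\mathbf{P}$ of the rectangle $\mathbf{I}$ such that

$$U(f, \mathbf{P}) - L(f, \mathbf{P}) < \varepsilon.$$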


We define an Archimedes sequence of partitions exactly the same as in the 


single variable case. 


Definition 6.9 Archimedes Sequence of Partitions 


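Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be a bounded function defined on $\mathbf{I}$.
If $\{\mathbf{P}_k\}$ is a sequence of partitions of the rectangle $\mathbf{I}$ such that

$$\lim_{k \to \infty} \left( U(f, \mathbf{P}_k) - L(f, \mathbf{P}_k) \right) = 0,$$

we call $\{\mathbf{P}_k\}$ an Archimedes sequence of partitions for the function $f$.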


Then we have the following theorem. 


Theorem 6.8 The Archimedes-Riemann Theorem 


Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be a bounded function defined on
$\mathbf{I}$. The function $f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable if and only if $f$ has an
Archimedes sequence of partitions $\{\mathbf{P}_k\}$. In this case, the integral $\int_{\mathbf{I}} f$ can
be computed by

$$\int_{\mathbf{I}} f = \lim_{k \to \infty} L(f, \mathbf{P}_k) = \lim_{k \to \infty} U(f, \mathbf{P}_k).$$


A candidate for an Archimedes sequence of partitions is the sequence $\{\mathbf{P}_k\}$,
where $\mathbf{P}_k$ is the uniformly regular partition of $\mathbf{I}$ into $k^n$ rectangles.


Example 6.14 


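Let $\mathbf{I} = [0, 1] \times [0, 1]$. Consider the function $f : \mathbf{I} \to \mathbb{R}$ defined as

$$f(x, y) = \begin{cases} 1, & \text{if } x \geq y, \\ 0, & \text{if } x < y. \end{cases}$$

For $k \in \mathbb{Z}^+$, let $\mathbf{P}_k$ be the uniformly regular partition of $\mathbf{I}$ into $k^2$ rectangles.


(a) For each $k \in \mathbb{Z}^+$, compute the Darboux lower sum $L(f, \mathbf{P}_k)$ and the
Darboux upper sum $U(f, \mathbf{P}_k)$.


(b) Show that $f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable and find the integral $\int_{\mathbf{I}} f$.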


Solution 


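Fix $k \in \mathbb{Z}^+$, and let $P_k = \{u_0, u_1, \ldots, u_k\}$, where $u_i = \frac{i}{k}$ for $0 \leq i \leq k$.
Then $\mathbf{P}_k = (P_k, P_k)$, and it divides $\mathbf{I} = [0, 1] \times [0, 1]$ into the $k^2$ rectangles
$\mathbf{J}_{i,j}$, $1 \leq i \leq k$, $1 \leq j \leq k$, where $\mathbf{J}_{i,j} = [u_{i-1}, u_i] \times [u_{j-1}, u_j]$. We have

$$\text{vol}(\mathbf{J}_{i,j}) = \frac{1}{k^2}.$$

Let
$$m_{i,j} = \inf_{(x,y) \in \mathbf{J}_{i,j}} f(x, y) \quad \text{and} \quad M_{i,j} = \sup_{(x,y) \in \mathbf{J}_{i,j}} f(x, y).$$
Notice that if $i < j - 1$, then
$$x \leq u_i < u_{j-1} \leq y \quad \text{for all } (x, y) \in \mathbf{J}_{i,j}.$$

Hence,
$$f(x, y) = 0 \quad \text{for all } (x, y) \in \mathbf{J}_{i,j}.$$

This implies that

$$m_{i,j} = M_{i,j} = 0 \quad \text{when } i < j - 1.$$

If $i \geq j + 1$, then
$$x \geq u_{i-1} \geq u_j \geq y \quad \text{for all } (x, y) \in \mathbf{J}_{i,j}.$$

Hence,
$$f(x, y) = 1 \quad \text{for all } (x, y) \in \mathbf{J}_{i,j}.$$

This implies that
$$m_{i,j} = M_{i,j} = 1 \quad \text{when } i \geq j + 1.$$
When $i = j - 1$, if $(x, y)$ is in $\mathbf{J}_{i,j}$,

$$x \leq u_i = u_{j-1} \leq y,$$

and $x = y$ if and only if $(x, y)$ is the point $(u_i, u_{j-1})$. Hence, $f(x, y) = 0$
for all $(x, y) \in \mathbf{J}_{i,j}$, except for $(x, y) = (u_i, u_{j-1})$, where $f(u_i, u_{j-1}) = 1$.
Hence,

$$m_{i,j} = 0, \quad M_{i,j} = 1 \quad \text{when } i = j - 1.$$

When $i = j$, $0 \leq f(x, y) \leq 1$ for all $(x, y) \in \mathbf{J}_{i,i}$. Since $(u_{i-1}, u_i)$ and
$(u_i, u_i)$ are in $\mathbf{J}_{i,i}$, and $f(u_{i-1}, u_i) = 0$ while $f(u_i, u_i) = 1$, we find that

$$m_{i,j} = 0, \quad M_{i,j} = 1 \quad \text{when } i = j.$$

There are $\frac{k(k-1)}{2}$ pairs $(i, j)$ with $i \geq j + 1$, and $2k - 1$ pairs with $j - 1 \leq i \leq j$.
It follows that

$$L(f, \mathbf{P}_k) = \sum_{i=1}^{k} \sum_{j=1}^{k} m_{i,j}\, \text{vol}(\mathbf{J}_{i,j}) = \frac{1}{k^2} \cdot \frac{k(k-1)}{2} = \frac{k-1}{2k},$$

$$U(f, \mathbf{P}_k) = \sum_{i=1}^{k} \sum_{j=1}^{k} M_{i,j}\, \text{vol}(\mathbf{J}_{i,j}) = \frac{1}{k^2} \left( \frac{k(k-1)}{2} + 2k - 1 \right) = \frac{k^2 + 3k - 2}{2k^2}.$$

Since

$$U(f, \mathbf{P}_k) - L(f, \mathbf{P}_k) = \frac{2k - 1}{k^2} \quad \text{for all } k \in \mathbb{Z}^+,$$

we find that

$$\lim_{k \to \infty} \left( U(f, \mathbf{P}_k) - L(f, \mathbf{P}_k) \right) = 0.$$

Hence, $\{\mathbf{P}_k\}$ is an Archimedes sequence of partitions for $f$. By the
Archimedes-Riemann theorem, $f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable, and

$$\int_{\mathbf{I}} f = \lim_{k \to \infty} L(f, \mathbf{P}_k) = \lim_{k \to \infty} \frac{k-1}{2k} = \frac{1}{2}.$$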


Figure 6.6: This figure illustrates the different cases considered in Example 6.14 
when k = 8. 
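The case analysis in Example 6.14 translates directly into a computation of the Darboux sums. The sketch below uses exact rational arithmetic; the function name `darboux_sums` is hypothetical, and the values of $m_{i,j}$ and $M_{i,j}$ are taken from the four cases discussed in the solution rather than computed from $f$.

```python
from fractions import Fraction

def darboux_sums(k):
    # Darboux sums of f (1 if x >= y, else 0) for the uniformly regular
    # partition of [0,1] x [0,1] into k^2 squares.
    vol = Fraction(1, k * k)
    lower = upper = Fraction(0)
    for i in range(1, k + 1):
        for j in range(1, k + 1):
            if i >= j + 1:      # f = 1 on the whole square
                m, M = 1, 1
            elif i < j - 1:     # f = 0 on the whole square
                m, M = 0, 0
            else:               # i = j - 1 or i = j: f takes both values
                m, M = 0, 1
            lower += m * vol
            upper += M * vol
    return lower, upper

for k in (1, 2, 8, 100):
    lower, upper = darboux_sums(k)
    print(k, lower, upper, upper - lower)
```

As $k$ grows, the difference between the upper and lower sums shrinks like $2/k$, and both sums approach $1/2$.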


As in the single variable case, there is an equivalent definition for Riemann
integrability using Riemann sums.
For a partition $P = \{x_0, x_1, \ldots, x_k\}$ of an interval $[a, b]$, we define the gap of
the partition $P$ as
$$|P| = \max \{ x_i - x_{i-1} \mid 1 \leq i \leq k \}.$$

For a closed rectangle $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, we replace the length $x_i - x_{i-1}$ of an interval
in the partition by the diameter of a rectangle in the partition. Recall that the
diameter of a rectangle $\mathbf{J} = \prod_{i=1}^{n} [u_i, v_i]$ is

$$\text{diam}\, \mathbf{J} = \sqrt{(v_1 - u_1)^2 + \cdots + (v_n - u_n)^2}.$$


Definition 6.10 Gap of a Partition 


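Let $\mathbf{P}$ be a partition of the rectangle $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$. Then the gap of the
partition $\mathbf{P}$ is defined as

$$|\mathbf{P}| = \max \{ \text{diam}\, \mathbf{J} \mid \mathbf{J} \in \mathcal{J}_{\mathbf{P}} \}.$$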


Example 6.15 


Find the gap of the partition $\mathbf{P} = (P_1, P_2)$ of the rectangle $\mathbf{I} = [-2, 9] \times
[1, 6]$ defined in Example 6.1, where $P_1 = \{-2, 0, 4, 9\}$ and $P_2 = \{1, 3, 6\}$.


Solution
The lengths of the three intervals in the partition $P_1 = \{-2, 0, 4, 9\}$ of the
interval $[-2, 9]$ are 2, 4 and 5 respectively. The lengths of the two intervals
in the partition $P_2 = \{1, 3, 6\}$ of the interval $[1, 6]$ are 2 and 3 respectively.
Therefore, the diameters of the 6 rectangles in the partition $\mathbf{P}$ are

$$\sqrt{2^2 + 2^2}, \quad \sqrt{4^2 + 2^2}, \quad \sqrt{5^2 + 2^2},$$
$$\sqrt{2^2 + 3^2}, \quad \sqrt{4^2 + 3^2}, \quad \sqrt{5^2 + 3^2}.$$

From this, we see that the gap of $\mathbf{P}$ is $\sqrt{5^2 + 3^2} = \sqrt{34}$.
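The gap computed in Example 6.15 can be obtained in two ways: by taking the largest diameter over all rectangles, or by combining the gaps of the one-dimensional partitions. The sketch below does both; the helper names `lengths`, `gap` and `gap_from_components` are chosen here for illustration.

```python
from itertools import product
from math import isclose, sqrt

def lengths(points):
    # Lengths of the subintervals of a one-dimensional partition.
    return [b - a for a, b in zip(points[:-1], points[1:])]

def gap(partition):
    # Largest diameter among the rectangles of the partition.
    return max(sqrt(sum(l * l for l in edge_lengths))
               for edge_lengths in product(*map(lengths, partition)))

def gap_from_components(partition):
    # Square root of the sum of squares of the one-dimensional gaps.
    return sqrt(sum(max(lengths(p)) ** 2 for p in partition))

P = ([-2, 0, 4, 9], [1, 3, 6])
print(gap(P), gap_from_components(P))
```

Both computations give $\sqrt{34}$, since the largest diameter is attained on the rectangle whose every edge is a longest subinterval.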


In the example above, notice that $|P_1| = 5$ and $|P_2| = 3$. In general, it is not
difficult to see the following.


Proposition 6.9


Let $\mathbf{P} = (P_1, \ldots, P_n)$ be a partition of the closed rectangle $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$.
Then

$$|\mathbf{P}| = \sqrt{|P_1|^2 + \cdots + |P_n|^2}.$$




The following theorem gives equivalent definitions of Riemann integrability 
of a bounded function. 


Theorem 6.10 Equivalent Definitions for Riemann Integrability 


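Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be a bounded function defined on $\mathbf{I}$.
The following three statements are equivalent for saying that $f : \mathbf{I} \to \mathbb{R}$ is
Riemann integrable.


(a) The lower integral and the upper integral are the same. Namely,

$$\underline{\int_{\mathbf{I}}} f = \overline{\int_{\mathbf{I}}} f.$$


(b) There exists a number $I$ that satisfies the following. For any $\varepsilon > 0$,
there exists a $\delta > 0$ such that if $\mathbf{P}$ is a partition of the rectangle $\mathbf{I}$ with
$|\mathbf{P}| < \delta$, then

$$|R(f, \mathbf{P}, A) - I| < \varepsilon$$

for any choice of intermediate points $A = \{\boldsymbol{\xi}_{\mathbf{J}}\}$ for the partition $\mathbf{P}$.


(c) For any $\varepsilon > 0$, there exists a $\delta > 0$ such that if $\mathbf{P}$ is a partition of the
rectangle $\mathbf{I}$ with $|\mathbf{P}| < \delta$, then

$$U(f, \mathbf{P}) - L(f, \mathbf{P}) < \varepsilon.$$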


The most useful definition is in fact the second one in terms of Riemann sums.
It says that a bounded function $f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable if the limit

$$\lim_{|\mathbf{P}| \to 0} R(f, \mathbf{P}, A)$$


exists. As a consequence of Theorem 6.10, we have the following. 


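Theorem 6.11


Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be a bounded function defined on $\mathbf{I}$. If
$f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable, then for any sequence $\{\mathbf{P}_k\}$ of partitions
of $\mathbf{I}$ satisfying

$$\lim_{k \to \infty} |\mathbf{P}_k| = 0,$$

we have

(i) $\displaystyle \int_{\mathbf{I}} f = \lim_{k \to \infty} L(f, \mathbf{P}_k) = \lim_{k \to \infty} U(f, \mathbf{P}_k)$.

(ii) $\displaystyle \int_{\mathbf{I}} f = \lim_{k \to \infty} R(f, \mathbf{P}_k, A_k)$, where for each $k \in \mathbb{Z}^+$, $A_k$ is a choice of
intermediate points for the partition $\mathbf{P}_k$.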


The proof is exactly the same as the single variable case. The contrapositive 
of Theorem 6.11 gives the following. 


Theorem 6.12 


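Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be a bounded function defined on $\mathbf{I}$.
Assume that $\{\mathbf{P}_k\}$ is a sequence of partitions of $\mathbf{I}$ such that

$$\lim_{k \to \infty} |\mathbf{P}_k| = 0.$$

(a) If for each $k \in \mathbb{Z}^+$, there exists a choice of intermediate points $A_k$ for
the partition $\mathbf{P}_k$, such that the limit $\displaystyle \lim_{k \to \infty} R(f, \mathbf{P}_k, A_k)$ does not exist,
then $f : \mathbf{I} \to \mathbb{R}$ is not Riemann integrable.

(b) If for each $k \in \mathbb{Z}^+$, there exist two choices of intermediate points $A_k$
and $B_k$ for the partition $\mathbf{P}_k$ so that the two limits $\displaystyle \lim_{k \to \infty} R(f, \mathbf{P}_k, A_k)$ and
$\displaystyle \lim_{k \to \infty} R(f, \mathbf{P}_k, B_k)$ are not the same, then $f : \mathbf{I} \to \mathbb{R}$ is not Riemann
integrable.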


Theorem 6.12 is useful for justifying that a bounded function is not Riemann
integrable, without having to compute the lower integral or the upper integral. To
apply this theorem, we usually consider the sequence of partitions $\{\mathbf{P}_k\}$, where
$\mathbf{P}_k$ is the uniformly regular partition of $\mathbf{I}$ into $k^n$ rectangles.


Example 6.16 


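Let $\mathbf{I} = [0,1] \times [0,1]$, and let $f : \mathbf{I} \to \mathbb{R}$ be the function defined as

$$f(x, y) = \begin{cases} y, & \text{if } x \text{ is rational}, \\ 0, & \text{if } x \text{ is irrational}. \end{cases}$$

Show that $f : \mathbf{I} \to \mathbb{R}$ is not Riemann integrable.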


Solution 


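For $k \in \mathbb{Z}^+$, let $\mathbf{P}_k$ be the uniformly regular partition of $\mathbf{I}$ into $k^2$ rectangles.
In each coordinate, the partition points of $\mathbf{P}_k$ are $u_0, u_1, \ldots, u_k$, where $u_i = \dfrac{i}{k}$ for
$0 \le i \le k$. Notice that $|\mathbf{P}_k| = \dfrac{\sqrt{2}}{k}$ and so $\displaystyle \lim_{k \to \infty} |\mathbf{P}_k| = 0$.

The partition $\mathbf{P}_k$ divides the square $\mathbf{I}$ into $k^2$ squares $\mathbf{I}_{i,j}$, $1 \le i \le k$,
$1 \le j \le k$, where $\mathbf{I}_{i,j} = [u_{i-1}, u_i] \times [u_{j-1}, u_j]$. Since
irrational numbers are dense, there is an irrational number $c_i$ in the interval
$(u_{i-1}, u_i)$. For $1 \le i \le k$, $1 \le j \le k$, let $\boldsymbol{\alpha}_{i,j}$ and $\boldsymbol{\beta}_{i,j}$ be the points in $\mathbf{I}_{i,j}$
given respectively by

$$\boldsymbol{\alpha}_{i,j} = (c_i, u_j), \qquad \boldsymbol{\beta}_{i,j} = (u_i, u_j).$$

Then

$$f(\boldsymbol{\alpha}_{i,j}) = 0, \qquad f(\boldsymbol{\beta}_{i,j}) = u_j.$$

Let $A_k = \{\boldsymbol{\alpha}_{i,j}\}$ and $B_k = \{\boldsymbol{\beta}_{i,j}\}$. Then the Riemann sums $R(f, \mathbf{P}_k, A_k)$
and $R(f, \mathbf{P}_k, B_k)$ are given respectively by

$$R(f, \mathbf{P}_k, A_k) = \sum_{i=1}^{k} \sum_{j=1}^{k} f(\boldsymbol{\alpha}_{i,j}) \operatorname{vol}(\mathbf{I}_{i,j}) = 0,$$

$$R(f, \mathbf{P}_k, B_k) = \sum_{i=1}^{k} \sum_{j=1}^{k} f(\boldsymbol{\beta}_{i,j}) \operatorname{vol}(\mathbf{I}_{i,j}) = \frac{1}{k^2} \sum_{i=1}^{k} \sum_{j=1}^{k} \frac{j}{k} = \frac{k+1}{2k}.$$

Therefore, we find that

$$\lim_{k \to \infty} R(f, \mathbf{P}_k, A_k) = 0, \qquad \lim_{k \to \infty} R(f, \mathbf{P}_k, B_k) = \frac{1}{2}.$$

Since the two limits are not the same, we conclude that $f : \mathbf{I} \to \mathbb{R}$ is not
Riemann integrable.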
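The two Riemann sums in this solution can be checked with exact arithmetic. The sketch below assumes only what the solution states, namely that $f$ vanishes at every tag $\boldsymbol{\alpha}_{i,j}$ and takes the value $u_j = j/k$ at the tag $\boldsymbol{\beta}_{i,j}$. The function names are hypothetical. Irrational tags cannot be represented exactly on a computer, so the first sum is encoded by its known value rather than by evaluating $f$.

```python
from fractions import Fraction

def riemann_sum_rational_tags(k):
    """R(f, P_k, B_k): the tags are beta_{i,j} = (u_i, u_j), where f equals u_j = j/k."""
    area = Fraction(1, k * k)
    return sum(Fraction(j, k) * area
               for i in range(1, k + 1)
               for j in range(1, k + 1))

def riemann_sum_irrational_tags(k):
    """R(f, P_k, A_k): every tag has an irrational first coordinate, so f is 0 there."""
    return Fraction(0)

for k in (1, 10, 100):
    print(k, riemann_sum_irrational_tags(k), riemann_sum_rational_tags(k))
```

The second column stays at 0 while the third tends to $1/2$, which is the disagreement that Theorem 6.12 (b) exploits.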


Now we return to the proof of Theorem 6.10. To prove this theorem, it is easier 
to show that (a) is equivalent to (c), and (b) is equivalent to (c). We will prove the 
equivalence of (a) and (c). The proof of the equivalence of (b) and (c) is left to the 
exercises. It is a consequence of the inequality 


$$L(f, \mathbf{P}) \le R(f, \mathbf{P}, A) \le U(f, \mathbf{P}),$$


which holds for any partition P of the rectangle I, and any choice of intermediate 
points A for the partition P. 
By Theorem 6.7, (a) is equivalent to

(a') For every $\varepsilon > 0$, there is a partition $\mathbf{P}$ of $\mathbf{I}$ such that
$U(f, \mathbf{P}) - L(f, \mathbf{P}) < \varepsilon$.

Thus, to prove the equivalence of (a) and (c), it is sufficient to show the equivalence
of (a') and (c). That (c) implies (a') is obvious. Hence, we are left with the
most technical part, which is the proof that (a') implies (c).

We formulate this as a standalone theorem.




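Theorem 6.13

Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $\mathbf{P}_0$ be a fixed partition of $\mathbf{I}$. Given that $f : \mathbf{I} \to \mathbb{R}$
is a bounded function defined on $\mathbf{I}$, for any $\varepsilon > 0$, there is a $\delta > 0$ such
that for all partitions $\mathbf{P}$ of $\mathbf{I}$, if $|\mathbf{P}| < \delta$, then

$$U(f, \mathbf{P}) - L(f, \mathbf{P}) < U(f, \mathbf{P}_0) - L(f, \mathbf{P}_0) + \varepsilon. \tag{6.1}$$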


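If Theorem 6.13 is proved, we can show that (a') implies (c) in Theorem
6.10 as follows. Given $\varepsilon > 0$, (a') implies that we can choose a partition $\mathbf{P}_0$ such
that

$$U(f, \mathbf{P}_0) - L(f, \mathbf{P}_0) < \frac{\varepsilon}{2}.$$

By Theorem 6.13, there is a $\delta > 0$ such that for all partitions $\mathbf{P}$ of $\mathbf{I}$, if
$|\mathbf{P}| < \delta$, then

$$U(f, \mathbf{P}) - L(f, \mathbf{P}) < U(f, \mathbf{P}_0) - L(f, \mathbf{P}_0) + \frac{\varepsilon}{2} < \varepsilon.$$

This proves that (a') implies (c).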


Hence, it remains for us to prove Theorem 6.13. Let us introduce some additional
notation. Given the rectangle $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, for $1 \le i \le n$, let

$$S_i = \frac{\operatorname{vol}(\mathbf{I})}{b_i - a_i} = (b_1 - a_1) \times \cdots \times (b_{i-1} - a_{i-1})(b_{i+1} - a_{i+1}) \times \cdots \times (b_n - a_n). \tag{6.2}$$

This is the area of the boundary of $\mathbf{I}$ that is contained in the hyperplane $x_i = a_i$ or
$x_i = b_i$. For example, when $n = 2$ and $\mathbf{I} = [a_1, b_1] \times [a_2, b_2]$, $S_1 = b_2 - a_2$ is the
length of the vertical side, while $S_2 = b_1 - a_1$ is the length of the horizontal side
of the rectangle $\mathbf{I}$.




Proof of Theorem 6.13
Since $f : \mathbf{I} \to \mathbb{R}$ is bounded, there is a positive number $M$ such that

$$|f(\mathbf{x})| \le M \quad \text{for all } \mathbf{x} \in \mathbf{I}.$$

Assume that $\mathbf{P}_0 = (\widetilde{P}_1, \ldots, \widetilde{P}_n)$. For $1 \le i \le n$, let $k_i$ be the number of
intervals in the partition $\widetilde{P}_i$. Let

$$K = \max\{k_1, \ldots, k_n\},$$

and

$$S = S_1 + \cdots + S_n,$$

where $S_i$, $1 \le i \le n$, are defined by (6.2). Given $\varepsilon > 0$, let

$$\delta = \frac{\varepsilon}{4MKS}.$$

Then $\delta > 0$. If $\mathbf{P} = (P_1, \ldots, P_n)$ is a partition of $\mathbf{I}$ with $|\mathbf{P}| < \delta$, we want to
show that (6.1) holds. Let $\mathbf{P}^* = (P_1^*, \ldots, P_n^*)$ be the common refinement of
$\mathbf{P}_0$ and $\mathbf{P}$ such that $P_i^*$ is the partition of $[a_i, b_i]$ that contains all the partition
points of $\widetilde{P}_i$ and $P_i$. For $1 \le i \le n$, let $U_i$ be the collection of intervals in
$P_i$ which contain partition points of $\widetilde{P}_i$, and let $V_i$ be the collection of the
intervals of $P_i$ that are not in $U_i$. Each interval in $V_i$ must be in the interior
of one of the intervals in $\widetilde{P}_i$. Thus, each interval in $V_i$ is an interval in the
partition $P_i^*$. Since each partition point of $\widetilde{P}_i$ can be contained in at most
two intervals of $P_i$, but the first and last partition points of $P_i$ and $\widetilde{P}_i$ are the
same, we find that $|U_i| \le 2k_i$.

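Since $|P_i| \le |\mathbf{P}| < \delta$, each interval in $P_i$ has length less than $\delta$. Therefore,
the sum of the lengths of the intervals in $U_i$ is less than $2k_i\delta$. Let

$$\mathcal{Q}_i = \{\mathbf{J} \in \mathcal{J}_{\mathbf{P}} \mid \text{the } i^{\text{th}}\text{-edge of } \mathbf{J} \text{ is from } U_i\}.$$

Then

$$\sum_{\mathbf{J} \in \mathcal{Q}_i} \operatorname{vol}(\mathbf{J}) < 2k_i \delta S_i \le 2K\delta S_i.$$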



Figure 6.7: The partitions $\mathbf{P}_0$ and $\mathbf{P}$ in the proof of Theorem 6.13. $\mathbf{P}_0$ is the
partition with red grids, while $\mathbf{P}$ is the partition with blue grids. The shaded
rectangles are the rectangles in $\mathbf{P}$ that contain partition points of $\mathbf{P}_0$.


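Let $\mathcal{Q} = \bigcup_{i=1}^{n} \mathcal{Q}_i$. Then

$$\sum_{\mathbf{J} \in \mathcal{Q}} \operatorname{vol}(\mathbf{J}) < 2K\delta \sum_{i=1}^{n} S_i = 2K\delta S.$$

For each of the rectangles $\mathbf{J}$ that is in $\mathcal{Q}$, we do a simple estimate

$$M_{\mathbf{J}} - m_{\mathbf{J}} \le 2M.$$

Therefore,

$$\sum_{\mathbf{J} \in \mathcal{Q}} (M_{\mathbf{J}} - m_{\mathbf{J}}) \operatorname{vol}(\mathbf{J}) < 4MK\delta S = \varepsilon.$$

For the rectangles $\mathbf{J}$ that are in $\mathcal{J}_{\mathbf{P}} \setminus \mathcal{Q}$, each of them is a rectangle in the
partition $\mathbf{P}^*$. Therefore,

$$\sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}} \setminus \mathcal{Q}} (M_{\mathbf{J}} - m_{\mathbf{J}}) \operatorname{vol}(\mathbf{J}) \le U(f, \mathbf{P}^*) - L(f, \mathbf{P}^*) \le U(f, \mathbf{P}_0) - L(f, \mathbf{P}_0).$$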


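Hence,

$$\begin{aligned} U(f, \mathbf{P}) - L(f, \mathbf{P}) &= \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} (M_{\mathbf{J}} - m_{\mathbf{J}}) \operatorname{vol}(\mathbf{J}) \\ &= \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}} \setminus \mathcal{Q}} (M_{\mathbf{J}} - m_{\mathbf{J}}) \operatorname{vol}(\mathbf{J}) + \sum_{\mathbf{J} \in \mathcal{Q}} (M_{\mathbf{J}} - m_{\mathbf{J}}) \operatorname{vol}(\mathbf{J}) \\ &< U(f, \mathbf{P}_0) - L(f, \mathbf{P}_0) + \varepsilon. \end{aligned}$$

This completes the proof.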

Finally we extend Riemann integrals to functions $f : \mathfrak{D} \to \mathbb{R}$ that are defined
on bounded subsets $\mathfrak{D}$ of $\mathbb{R}^n$. If $\mathfrak{D}$ is bounded, there is a positive number $L$ such
that

$$\|\mathbf{x}\| \le L \quad \text{for all } \mathbf{x} \in \mathfrak{D}.$$

This implies that $\mathfrak{D}$ is contained in the closed rectangle $\mathbf{I}_L = \prod_{i=1}^{n} [-L, L]$. To
define the Riemann integral of $f : \mathfrak{D} \to \mathbb{R}$, we need to extend the domain of $f$
from $\mathfrak{D}$ to $\mathbf{I}_L$. To avoid affecting the integral, we should extend by zero.


Definition 6.11 Zero Extension 


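Let $\mathfrak{D}$ be a subset of $\mathbb{R}^n$, and let $f : \mathfrak{D} \to \mathbb{R}$ be a function defined on $\mathfrak{D}$.
The zero extension of $f : \mathfrak{D} \to \mathbb{R}$ is the function $\check{f} : \mathbb{R}^n \to \mathbb{R}$ which is
defined as

$$\check{f}(\mathbf{x}) = \begin{cases} f(\mathbf{x}), & \text{if } \mathbf{x} \in \mathfrak{D}, \\ 0, & \text{if } \mathbf{x} \notin \mathfrak{D}. \end{cases}$$

If $\mathcal{U}$ is any subset of $\mathbb{R}^n$ that contains $\mathfrak{D}$, then the zero extension of $f$ to $\mathcal{U}$
is the function $\check{f} : \mathcal{U} \to \mathbb{R}$.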


Obviously, if $f : \mathfrak{D} \to \mathbb{R}$ is a bounded function, its zero extension $\check{f} : \mathbb{R}^n \to \mathbb{R}$
is also bounded. Since we have defined Riemann integrability for a bounded
function $g : \mathbf{I} \to \mathbb{R}$ that is defined on a closed rectangle $\mathbf{I}$, it is natural to say that
a function $f : \mathfrak{D} \to \mathbb{R}$ is Riemann integrable if its zero extension $\check{f} : \mathbf{I} \to \mathbb{R}$ to a
closed rectangle $\mathbf{I}$ is Riemann integrable, and define

$$\int_{\mathfrak{D}} f = \int_{\mathbf{I}} \check{f}.$$

For this to be unambiguous, we have to check that if $\mathbf{I}_1$ and $\mathbf{I}_2$ are closed rectangles
that contain the bounded set $\mathfrak{D}$, the zero extension $\check{f} : \mathbf{I}_1 \to \mathbb{R}$ is Riemann
integrable if and only if the zero extension $\check{f} : \mathbf{I}_2 \to \mathbb{R}$ is Riemann integrable.
Moreover,

$$\int_{\mathbf{I}_1} \check{f} = \int_{\mathbf{I}_2} \check{f}.$$

This small technicality will be proved in Section 6.2. Assuming this, we can
give the following formal definition for Riemann integrability of a bounded function
defined on a bounded domain.


Definition 6.12 Riemann Integrals of General Functions 


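Let $\mathfrak{D}$ be a bounded subset of $\mathbb{R}^n$, and let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$ be a closed
rectangle in $\mathbb{R}^n$ that contains $\mathfrak{D}$. Given that $f : \mathfrak{D} \to \mathbb{R}$ is a bounded
function defined on $\mathfrak{D}$, we say that $f : \mathfrak{D} \to \mathbb{R}$ is Riemann integrable if
its zero extension $\check{f} : \mathbf{I} \to \mathbb{R}$ is Riemann integrable. If this is the case, we
define the integral of $f$ over $\mathfrak{D}$ as

$$\int_{\mathfrak{D}} f = \int_{\mathbf{I}} \check{f}.$$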


Example 6.17 


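Let $\mathbf{I} = [0,1] \times [0,1]$, and let $f : \mathbf{I} \to \mathbb{R}$ be the function defined as

$$f(x, y) = \begin{cases} 1, & \text{if } x \ge y, \\ 0, & \text{if } x < y, \end{cases}$$

which is considered in Example 6.14. Let

$$\mathfrak{D} = \{(x, y) \in \mathbf{I} \mid y \le x\},$$

and let $g : \mathfrak{D} \to \mathbb{R}$ be the constant function $g(\mathbf{x}) = 1$. Then $f : \mathbf{I} \to \mathbb{R}$ is
the zero extension of $g$ to the square $\mathbf{I}$ that contains $\mathfrak{D}$.

In Example 6.14, we have shown that $f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable and
$\displaystyle \int_{\mathbf{I}} f = \frac{1}{2}$. Therefore, $g : \mathfrak{D} \to \mathbb{R}$ is Riemann integrable and $\displaystyle \int_{\mathfrak{D}} g = \frac{1}{2}$.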
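The zero extension and the role of the enclosing rectangle can be illustrated numerically. The sketch below is only an illustration under stated assumptions: `zero_extension` and `midpoint_riemann_sum` are hypothetical helper names, the intermediate points are taken to be cell centres, and the two enclosing squares $[0,1]^2$ and $[-1,2]^2$ are arbitrary choices. It computes Riemann sums of the zero extension of $g = 1$ on the triangle $0 \le y \le x \le 1$ over both squares.

```python
from fractions import Fraction

def zero_extension(f, in_domain):
    """Return the function equal to f on the domain and 0 elsewhere."""
    return lambda x, y: f(x, y) if in_domain(x, y) else 0

def midpoint_riemann_sum(h, a, b, k):
    """Riemann sum of h over the square [a,b]x[a,b] for the uniform k*k
    partition, with the centre of each cell as intermediate point."""
    side = Fraction(b - a, k)
    total = Fraction(0)
    for i in range(k):
        for j in range(k):
            x = a + (i + Fraction(1, 2)) * side
            y = a + (j + Fraction(1, 2)) * side
            total += h(x, y) * side * side
    return total

def in_D(x, y):
    """Membership test for the triangle 0 <= y <= x <= 1."""
    return 0 <= y <= x <= 1

g_ext = zero_extension(lambda x, y: 1, in_D)
print(midpoint_riemann_sum(g_ext, 0, 1, 20))    # enclosing square [0,1]^2
print(midpoint_riemann_sum(g_ext, -1, 2, 60))   # enclosing square [-1,2]^2
```

With cells of the same side length $1/20$, the two sums agree, and both approach $1/2$ as the partition is refined. This is consistent with the claim that the choice of enclosing rectangle does not matter.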


Remark 6.1 


Here we make two remarks about the Riemann integrals.

1. When $f : \mathfrak{D} \to \mathbb{R}$ is the constant function $f(\mathbf{x}) = 1$, we should expect that it
is Riemann integrable if and only if $\mathfrak{D}$ has a volume, which should be
defined as

$$\operatorname{vol}(\mathfrak{D}) = \int_{\mathfrak{D}} d\mathbf{x}.$$

2. If $f : \mathfrak{D} \to \mathbb{R}$ is a nonnegative continuous function defined on the
bounded set $\mathfrak{D}$ that has a volume, we would expect that $f : \mathfrak{D} \to \mathbb{R}$ is
Riemann integrable, and the integral $\displaystyle \int_{\mathfrak{D}} f(\mathbf{x})\, d\mathbf{x}$ gives the volume of the
solid bounded between $\mathfrak{D}$ and the graph of $f$.

In Section 6.3, we will give a characterization of sets $\mathfrak{D}$ that have volumes.
We will also prove that if $f : \mathfrak{D} \to \mathbb{R}$ is a continuous function defined on a set $\mathfrak{D}$
that has volume, then $f : \mathfrak{D} \to \mathbb{R}$ is Riemann integrable.




Exercises 6.1 


Question 1 


Question 2 


Let $\mathbf{I} = [-5, 8] \times [2, 5]$, and let $f : \mathbf{I} \to \mathbb{R}$ be the function defined as
$f(x, y) = x^2 + 2y$. Consider the partition $\mathbf{P} = (P_1, P_2)$ of $\mathbf{I}$ with $P_1 =
\{-5, -1, 2, 7, 8\}$ and $P_2 = \{2, 4, 5\}$. Find the Darboux lower sum $L(f, \mathbf{P})$
and the Darboux upper sum $U(f, \mathbf{P})$.


Question 3 


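Let $\mathbf{I} = [-5, 8] \times [2, 5]$, and let $f : \mathbf{I} \to \mathbb{R}$ be the function defined as
$f(x, y) = x^2 + 2y$. Consider the partition $\mathbf{P} = (P_1, P_2)$ of $\mathbf{I}$ with $P_1 =
\{-5, -1, 2, 7, 8\}$ and $P_2 = \{2, 4, 5\}$. For each rectangle $\mathbf{J} = [a, b] \times [c, d]$
in the partition $\mathbf{P}$, let $\boldsymbol{\alpha}_{\mathbf{J}} = (a, c)$ and $\boldsymbol{\beta}_{\mathbf{J}} = (b, d)$. Find the Riemann sums
$R(f, \mathbf{P}, A)$ and $R(f, \mathbf{P}, B)$, where $A = \{\boldsymbol{\alpha}_{\mathbf{J}}\}$ and $B = \{\boldsymbol{\beta}_{\mathbf{J}}\}$.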


Question 4 


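Let $\mathbf{I} = [-1, 1] \times [2, 5]$, and let $f : \mathbf{I} \to \mathbb{R}$ be the function defined as

$$f(x, y) = \begin{cases} 1, & \text{if } x \text{ and } y \text{ are rational}, \\ 0, & \text{otherwise}. \end{cases}$$

(a) Given that $\mathbf{P}$ is a partition of $\mathbf{I}$, find the Darboux lower sum $L(f, \mathbf{P})$
and the Darboux upper sum $U(f, \mathbf{P})$.

(b) Find the lower integral $\displaystyle \underline{\int_{\mathbf{I}}} f$ and the upper integral $\displaystyle \overline{\int_{\mathbf{I}}} f$.

(c) Explain why $f : \mathbf{I} \to \mathbb{R}$ is not Riemann integrable.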




Question 5 


Let I = [0, 4] x [0,2]. Consider the function f : I > R defined as 
f(x,y) = 2 + 3y +1. 


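For $k \in \mathbb{Z}^+$, let $\mathbf{P}_k$ be the uniformly regular partition of $\mathbf{I} = [0, 4] \times [0, 2]$
into $k^2$ rectangles.

(a) For each $k \in \mathbb{Z}^+$, compute the Darboux lower sum $L(f, \mathbf{P}_k)$ and the
Darboux upper sum $U(f, \mathbf{P}_k)$.

(b) Show that $f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable and find the integral $\displaystyle \int_{\mathbf{I}} f$.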


Question 6 


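Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be a function defined on $\mathbf{I}$. Show that
the following are equivalent.

(a) There exists a number $I$ that satisfies the following. For any $\varepsilon > 0$,
there exists a $\delta > 0$ such that if $\mathbf{P}$ is a partition of the rectangle $\mathbf{I}$ with
$|\mathbf{P}| < \delta$, then

$$|R(f, \mathbf{P}, A) - I| < \varepsilon$$

for any choice of intermediate points $A = \{\boldsymbol{\xi}_{\mathbf{J}}\}$ for the partition $\mathbf{P}$.

(b) For any $\varepsilon > 0$, there exists a $\delta > 0$ such that if $\mathbf{P}$ is a partition of the
rectangle $\mathbf{I}$ with $|\mathbf{P}| < \delta$, then

$$U(f, \mathbf{P}) - L(f, \mathbf{P}) < \varepsilon.$$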




6.2 Properties of Riemann Integrals 


In this section, we discuss properties of Riemann integrals. Let us first consider
Riemann integrals of functions $f : \mathbf{I} \to \mathbb{R}$ defined on closed rectangles of the form
$\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$. Using some of these properties, we prove that the definition of
Riemann integrability for functions $f : \mathfrak{D} \to \mathbb{R}$ defined on general bounded sets,
as given in Section 6.1, is unambiguous. Finally, we will extend the properties of
Riemann integrals to functions $f : \mathfrak{D} \to \mathbb{R}$ defined on bounded sets.

Linearity is one of the most important properties. For functions defined on
closed rectangles of the form $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, the proof is straightforward using the
Riemann sum definition of Riemann integrability, as in the single variable case.


Theorem 6.14 Linearity 


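Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ and $g : \mathbf{I} \to \mathbb{R}$ be Riemann integrable
functions. For any real numbers $\alpha$ and $\beta$, $(\alpha f + \beta g) : \mathbf{I} \to \mathbb{R}$ is also
Riemann integrable, and

$$\int_{\mathbf{I}} (\alpha f + \beta g) = \alpha \int_{\mathbf{I}} f + \beta \int_{\mathbf{I}} g.$$

Sketch of Proof
If $\mathbf{P}$ is a partition of $\mathbf{I}$ and $A$ is a set of intermediate points for $\mathbf{P}$, then

$$R(\alpha f + \beta g, \mathbf{P}, A) = \alpha R(f, \mathbf{P}, A) + \beta R(g, \mathbf{P}, A).$$

The result follows by taking the limit $|\mathbf{P}| \to 0$.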


Example 6.18 


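Let $\mathbf{I} = [0, 2] \times [0, 2]$, and let $f : \mathbf{I} \to \mathbb{R}$ and $g : \mathbf{I} \to \mathbb{R}$ be Riemann
integrable functions. Find the integrals $\displaystyle \int_{\mathbf{I}} f$ and $\displaystyle \int_{\mathbf{I}} g$ if

$$f(x, y) = g(y, x) \quad \text{and} \quad (f + g)(x, y) = 6 \quad \text{for all } (x, y) \in \mathbf{I}.$$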


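Solution

Since $\mathbf{I}$ is symmetric with respect to the line $y = x$ and $f(x, y) = g(y, x)$
for all $(x, y) \in \mathbf{I}$, we have $\displaystyle \int_{\mathbf{I}} f = \int_{\mathbf{I}} g$. By linearity,

$$\int_{\mathbf{I}} f + \int_{\mathbf{I}} g = \int_{\mathbf{I}} (f + g) = 6 \times \operatorname{vol}(\mathbf{I}) = 24.$$

Hence,

$$\int_{\mathbf{I}} f = \int_{\mathbf{I}} g = 12.$$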
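The answer can be sanity-checked on one concrete pair of functions satisfying the hypotheses. The pair $f(x, y) = 3 + x - y$ and $g(x, y) = 3 - x + y$ is an assumption chosen here for illustration (it satisfies $f(x, y) = g(y, x)$ and $f + g = 6$), and `midpoint_sum` is a hypothetical helper. The sketch evaluates Riemann sums over $[0, 2] \times [0, 2]$ with cell centres as intermediate points.

```python
from fractions import Fraction

def midpoint_sum(h, k):
    """Riemann sum of h over [0,2]x[0,2] for the uniform k*k partition,
    using the centre of each cell as intermediate point."""
    side = Fraction(2, k)
    mids = [(i + Fraction(1, 2)) * side for i in range(k)]
    return sum(h(x, y) for x in mids for y in mids) * side * side

def f(x, y):
    return 3 + x - y

def g(x, y):
    return 3 - x + y

print(midpoint_sum(f, 8), midpoint_sum(g, 8))
print(midpoint_sum(lambda x, y: f(x, y) + g(x, y), 8))
```

For these affine functions the symmetric terms cancel, so the Riemann sums already equal the integrals for every $k$.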


The following theorem is about the integral of a nonnegative function. 


Theorem 6.15 


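Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be a bounded function defined on $\mathbf{I}$.
Assume that $f(\mathbf{x}) \ge 0$ for all $\mathbf{x}$ in $\mathbf{I}$. If $f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable,
then

$$\int_{\mathbf{I}} f \ge 0.$$

For any partition $\mathbf{P}$ of $\mathbf{I}$, $L(f, \mathbf{P}) \ge 0$. Therefore,

$$\int_{\mathbf{I}} f = \underline{\int_{\mathbf{I}}} f \ge L(f, \mathbf{P}) \ge 0.$$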


The monotonicity theorem then follows from linearity and Theorem 6.15. 


Theorem 6.16 Monotonicity 


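Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ and $g : \mathbf{I} \to \mathbb{R}$ be Riemann integrable
functions. If $f(\mathbf{x}) \ge g(\mathbf{x})$ for all $\mathbf{x}$ in $\mathbf{I}$, then

$$\int_{\mathbf{I}} f \ge \int_{\mathbf{I}} g.$$

By linearity, the function $(f - g) : \mathbf{I} \to \mathbb{R}$ is integrable, and

$$\int_{\mathbf{I}} (f - g) = \int_{\mathbf{I}} f - \int_{\mathbf{I}} g.$$

By Theorem 6.15, $\displaystyle \int_{\mathbf{I}} (f - g) \ge 0$, and the assertion follows.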


The next important property is the additivity of the Riemann integrals. 


Theorem 6.17 Additivity 


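Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $\mathbf{P}_0$ be a partition of $\mathbf{I}$. If $f : \mathbf{I} \to \mathbb{R}$ is a bounded
function defined on $\mathbf{I}$, then $f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable if and only if
for each $\mathbf{J} \in \mathcal{J}_{\mathbf{P}_0}$, $f : \mathbf{J} \to \mathbb{R}$ is Riemann integrable. In such case, we also
have

$$\int_{\mathbf{I}} f = \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}_0}} \int_{\mathbf{J}} f.$$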


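It is sufficient to consider the case that $\mathbf{P}_0 = (P_1, \ldots, P_n)$ divides $\mathbf{I}$ into
two rectangles $\mathbf{I}_1$ and $\mathbf{I}_2$ by having a partition point $c$ inside the $j^{\text{th}}$-edge
$[a_j, b_j]$ for some $1 \le j \le n$. Namely, $P_j = \{a_j, c, b_j\}$, and for $i \ne j$,
$P_i = \{a_i, b_i\}$. The general case can be proved by induction, adding one
partition point at a time.

Assume that $f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable. Given $\varepsilon > 0$, there is a
partition $\mathbf{P}$ of $\mathbf{I}$ such that

$$U(f, \mathbf{P}) - L(f, \mathbf{P}) < \varepsilon.$$

Let $\mathbf{P}^*$ be a common refinement of $\mathbf{P}$ and $\mathbf{P}_0$. Then

$$U(f, \mathbf{P}^*) - L(f, \mathbf{P}^*) \le U(f, \mathbf{P}) - L(f, \mathbf{P}) < \varepsilon.$$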


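But $\mathbf{P}^*$ induces partitions $\mathbf{P}^*(\mathbf{I}_1)$ and $\mathbf{P}^*(\mathbf{I}_2)$ of $\mathbf{I}_1$ and $\mathbf{I}_2$, and we have

$$U(f, \mathbf{P}^*) = U(f, \mathbf{P}^*(\mathbf{I}_1)) + U(f, \mathbf{P}^*(\mathbf{I}_2)),$$

$$L(f, \mathbf{P}^*) = L(f, \mathbf{P}^*(\mathbf{I}_1)) + L(f, \mathbf{P}^*(\mathbf{I}_2)).$$

Therefore,

$$U(f, \mathbf{P}^*(\mathbf{I}_1)) - L(f, \mathbf{P}^*(\mathbf{I}_1)) + U(f, \mathbf{P}^*(\mathbf{I}_2)) - L(f, \mathbf{P}^*(\mathbf{I}_2)) < \varepsilon.$$

This implies that

$$U(f, \mathbf{P}^*(\mathbf{I}_j)) - L(f, \mathbf{P}^*(\mathbf{I}_j)) < \varepsilon \quad \text{for } j = 1, 2.$$

Hence, $f : \mathbf{I}_1 \to \mathbb{R}$ and $f : \mathbf{I}_2 \to \mathbb{R}$ are Riemann integrable.

Conversely, assume that $f : \mathbf{I}_1 \to \mathbb{R}$ and $f : \mathbf{I}_2 \to \mathbb{R}$ are Riemann
integrable. Let $\{\mathbf{P}_{1,k}\}$ and $\{\mathbf{P}_{2,k}\}$ be Archimedes sequences of partitions
for $f : \mathbf{I}_1 \to \mathbb{R}$ and $f : \mathbf{I}_2 \to \mathbb{R}$ respectively. Then

$$\int_{\mathbf{I}_j} f = \lim_{k \to \infty} U(f, \mathbf{P}_{j,k}) = \lim_{k \to \infty} L(f, \mathbf{P}_{j,k}) \quad \text{for } j = 1, 2.$$

For $k \in \mathbb{Z}^+$, let $\mathbf{P}_k^*$ be the partition of $\mathbf{I}$ obtained by taking unions of
partition points in $\mathbf{P}_{1,k}$ and $\mathbf{P}_{2,k}$. Then $\mathbf{P}_{1,k} = \mathbf{P}_k^*(\mathbf{I}_1)$ and $\mathbf{P}_{2,k} = \mathbf{P}_k^*(\mathbf{I}_2)$.
It follows that

$$U(f, \mathbf{P}_k^*) = U(f, \mathbf{P}_{1,k}) + U(f, \mathbf{P}_{2,k}), \qquad L(f, \mathbf{P}_k^*) = L(f, \mathbf{P}_{1,k}) + L(f, \mathbf{P}_{2,k}).$$

Therefore,

$$\lim_{k \to \infty} \left( U(f, \mathbf{P}_k^*) - L(f, \mathbf{P}_k^*) \right) = 0.$$

Hence, $\{\mathbf{P}_k^*\}$ is an Archimedes sequence of partitions for $f : \mathbf{I} \to \mathbb{R}$. This
shows that $f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable, and

$$\int_{\mathbf{I}} f = \lim_{k \to \infty} U(f, \mathbf{P}_k^*) = \lim_{k \to \infty} \left( U(f, \mathbf{P}_{1,k}) + U(f, \mathbf{P}_{2,k}) \right) = \int_{\mathbf{I}_1} f + \int_{\mathbf{I}_2} f.$$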
I 00 00 Ll ip 


Next we state a lemma which is useful. 


Figure 6.8: A partition $\mathbf{P}$ of $\mathbf{I}$ and the refined partition $\mathbf{P}^*$ that induces partitions
on $\mathbf{I}_1$ and $\mathbf{I}_2$.


Lemma 6.18 


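Given that $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$ is a closed rectangle in $\mathbb{R}^n$, let

$$\omega = \frac{1}{2} \min\{b_i - a_i \mid 1 \le i \le n\}.$$

(i) Given $\eta > 0$, let $\mathbf{I}_\eta$ be the closed rectangle $\mathbf{I}_\eta = \prod_{i=1}^{n} [a_i - \eta, b_i + \eta]$.
For any $\varepsilon > 0$, there exists $\delta > 0$ such that for any $0 < \eta < \delta$,

$$0 < \operatorname{vol}(\mathbf{I}_\eta) - \operatorname{vol}(\mathbf{I}) < \varepsilon.$$

(ii) Given $0 < \kappa < \omega$, let $\mathbf{I}_\kappa$ be the closed rectangle $\mathbf{I}_\kappa = \prod_{i=1}^{n} [a_i + \kappa, b_i - \kappa]$.
For any $\varepsilon > 0$, there exists $0 < \delta \le \omega$ such that for any $0 < \kappa < \delta$,

$$0 < \operatorname{vol}(\mathbf{I}) - \operatorname{vol}(\mathbf{I}_\kappa) < \varepsilon.$$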


Figure 6.9: Enlarging or shrinking a rectangle by an arbitrary amount. 




This lemma says that one can enlarge or shrink a rectangle so that its volume changes
by an arbitrarily small amount. It can be proved by elementary means, but here we use
some techniques from analysis to prove it.

We prove part (i). The argument for part (ii) is the same. Consider the
function $h : [0, \infty) \to \mathbb{R}$ defined by

$$h(\eta) = \operatorname{vol}(\mathbf{I}_\eta) = \prod_{i=1}^{n} (b_i - a_i + 2\eta).$$

As a function of $\eta$, $h(\eta)$ is a polynomial, and it is a strictly increasing
continuous function. The assertion is basically the definition of the limit

$$\lim_{\eta \to 0^+} h(\eta) = h(0).$$
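The proof of part (i) can be mirrored numerically. The sketch below is an illustration only: the names `enlarged_volume` and `find_delta`, the halving search, and the sample rectangle $[0,1] \times [0,2]$ are assumptions made here. It evaluates $h(\eta)$ and searches for a $\delta$ with $h(\delta) - h(0) < \varepsilon$. Because $h$ is strictly increasing, every $0 < \eta < \delta$ then satisfies $0 < h(\eta) - h(0) < \varepsilon$ as well.

```python
from math import prod

def enlarged_volume(edges, eta):
    """h(eta): the product of (b_i - a_i + 2*eta) over the edges [a_i, b_i]."""
    return prod(b - a + 2 * eta for a, b in edges)

def find_delta(edges, eps):
    """Halve a candidate delta until h(delta) - h(0) < eps.
    The loop terminates because h is continuous at 0."""
    delta = 1.0
    while enlarged_volume(edges, delta) - enlarged_volume(edges, 0) >= eps:
        delta /= 2
    return delta

edges = [(0, 1), (0, 2)]          # the rectangle [0,1] x [0,2], volume 2
delta = find_delta(edges, 0.01)
print(delta, enlarged_volume(edges, delta) - enlarged_volume(edges, 0))
```

The search does not produce the largest admissible $\delta$, only one that works, which is all the lemma asks for.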


The following theorem says that a bounded function f : I — R which is 


identically zero on the interior of I is Riemann integrable with integral 0. This is 
something we would have expected. 


Theorem 6.19 


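Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be a bounded function such that

$$f(\mathbf{x}) = 0 \quad \text{for all } \mathbf{x} \in \operatorname{int}(\mathbf{I}).$$

Then $f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable and

$$\int_{\mathbf{I}} f = 0.$$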


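Let $\omega = \dfrac{1}{2} \min\{b_i - a_i \mid 1 \le i \le n\}$. Since $f : \mathbf{I} \to \mathbb{R}$ is bounded, there is
a positive number $M$ such that

$$|f(\mathbf{x})| \le M \quad \text{for all } \mathbf{x} \in \mathbf{I}.$$

Given $\varepsilon > 0$, by Lemma 6.18, there is a $\kappa \in (0, \omega)$ such that

$$\operatorname{vol}(\mathbf{I}) - \operatorname{vol}(\mathbf{I}_\kappa) < \frac{\varepsilon}{M},$$

where $\mathbf{I}_\kappa = \prod_{i=1}^{n} [a_i + \kappa, b_i - \kappa]$. It is a rectangle that is contained in $\operatorname{int}(\mathbf{I})$.
Let $\mathbf{P} = (P_1, \ldots, P_n)$ be the partition of $\mathbf{I}$ with $P_i = \{a_i, a_i + \kappa, b_i - \kappa, b_i\}$.
Then $\mathbf{I}_\kappa$ is one of the rectangles in the partition $\mathbf{P}$. On $\mathbf{J} = \mathbf{I}_\kappa$, $f(\mathbf{x}) = 0$,
and so $M_{\mathbf{J}} = m_{\mathbf{J}} = 0$. For all other rectangles $\mathbf{J}$ in $\mathcal{J}_{\mathbf{P}}$, we use the crude
estimate

$$-M \le m_{\mathbf{J}} \le M_{\mathbf{J}} \le M.$$

Then

$$L(f, \mathbf{P}) = \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} m_{\mathbf{J}} \operatorname{vol}(\mathbf{J}) = \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}} \setminus \{\mathbf{I}_\kappa\}} m_{\mathbf{J}} \operatorname{vol}(\mathbf{J}) \ge -M \left( \operatorname{vol}(\mathbf{I}) - \operatorname{vol}(\mathbf{I}_\kappa) \right) > -\varepsilon.$$

In the same way, we find that $U(f, \mathbf{P}) < \varepsilon$. Since $\varepsilon > 0$ is arbitrary, we
find that

$$\underline{\int_{\mathbf{I}}} f \ge 0 \quad \text{and} \quad \overline{\int_{\mathbf{I}}} f \le 0.$$

Since $\displaystyle \underline{\int_{\mathbf{I}}} f \le \overline{\int_{\mathbf{I}}} f$, we conclude that

$$\underline{\int_{\mathbf{I}}} f = \overline{\int_{\mathbf{I}}} f = 0.$$

This proves that $f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable and

$$\int_{\mathbf{I}} f = 0.$$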


Now let us give a proof that the definition given in Section 6.1 for a bounded 


function f : © — R defined on a bounded subset of IR” to be Riemann integrable 


is unambiguous. The crucial point is the following. 




Lemma 6.20 


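Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$ and $\widetilde{\mathbf{I}} = \prod_{i=1}^{n} [\tilde{a}_i, \tilde{b}_i]$ be closed rectangles in $\mathbb{R}^n$ such that
$\mathbf{I} \subset \widetilde{\mathbf{I}}$, and let $f : \mathbf{I} \to \mathbb{R}$ be a bounded function defined on $\mathbf{I}$. Then
$f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable if and only if its zero extension $\check{f} : \widetilde{\mathbf{I}} \to \mathbb{R}$
is Riemann integrable. In such case, we also have

$$\int_{\mathbf{I}} f = \int_{\widetilde{\mathbf{I}}} \check{f}.$$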


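Let $\mathbf{P} = (P_1, \ldots, P_n)$ be the partition of $\widetilde{\mathbf{I}}$ such that $P_i$ is the set
that contains $\tilde{a}_i, a_i, b_i, \tilde{b}_i$.

For each rectangle $\mathbf{J}$ in $\mathcal{J}_{\mathbf{P}} \setminus \{\mathbf{I}\}$, it is disjoint from the interior of $\mathbf{I}$. Hence,
$\check{f}$ vanishes in the interior of $\mathbf{J}$. By Theorem 6.19, $\check{f} : \mathbf{J} \to \mathbb{R}$ is Riemann
integrable and $\displaystyle \int_{\mathbf{J}} \check{f} = 0$. It follows from the additivity theorem that $\check{f} : \widetilde{\mathbf{I}} \to \mathbb{R}$
is Riemann integrable if and only if $\check{f} : \mathbf{I} \to \mathbb{R}$ is Riemann integrable,
and

$$\int_{\widetilde{\mathbf{I}}} \check{f} = \int_{\mathbf{I}} \check{f}.$$

However, restricted to $\mathbf{I}$, $\check{f}(\mathbf{x}) = f(\mathbf{x})$. Hence, $f : \mathbf{I} \to \mathbb{R}$ is Riemann
integrable if and only if $\check{f} : \widetilde{\mathbf{I}} \to \mathbb{R}$ is Riemann integrable. In such case, we
have

$$\int_{\mathbf{I}} f = \int_{\widetilde{\mathbf{I}}} \check{f}.$$

Figure 6.10: The rectangle $\mathbf{I}$ is contained in the rectangle $\widetilde{\mathbf{I}}$.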




Finally we can prove the main result. 


Theorem 6.21 


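Let $\mathfrak{D}$ be a bounded set in $\mathbb{R}^n$, and let $f : \mathfrak{D} \to \mathbb{R}$ be a bounded function
defined on $\mathfrak{D}$. The definition for Riemann integrability of $f : \mathfrak{D} \to \mathbb{R}$ is
unambiguous. Namely, if $\mathbf{I}_1 = \prod_{i=1}^{n} [a_i, b_i]$ and $\mathbf{I}_2 = \prod_{i=1}^{n} [a_i', b_i']$ contain $\mathfrak{D}$,
the zero extension $\check{f} : \mathbf{I}_1 \to \mathbb{R}$ is Riemann integrable if and only if the zero
extension $\check{f} : \mathbf{I}_2 \to \mathbb{R}$ is Riemann integrable. In the latter case,

$$\int_{\mathbf{I}_1} \check{f} = \int_{\mathbf{I}_2} \check{f},$$

and so we can define unambiguously

$$\int_{\mathfrak{D}} f = \int_{\mathbf{I}} \check{f},$$

where $\mathbf{I}$ is any rectangle of the form $\prod_{i=1}^{n} [a_i, b_i]$ that contains $\mathfrak{D}$.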


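Let $\mathbf{I} = \mathbf{I}_1 \cap \mathbf{I}_2$. Then $\mathbf{I}$ is a rectangle that contains $\mathfrak{D}$ and is contained in
$\mathbf{I}_1$ and $\mathbf{I}_2$. Lemma 6.20 then says that $\check{f} : \mathbf{I}_1 \to \mathbb{R}$ is Riemann integrable if and
only if $\check{f} : \mathbf{I} \to \mathbb{R}$ is Riemann integrable, if and only if $\check{f} : \mathbf{I}_2 \to \mathbb{R}$ is Riemann
integrable. In the latter case,

$$\int_{\mathbf{I}_1} \check{f} = \int_{\mathbf{I}} \check{f} = \int_{\mathbf{I}_2} \check{f}.$$

Figure 6.11: The set $\mathfrak{D}$ is contained in the rectangles $\mathbf{I}_1$ and $\mathbf{I}_2$.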




Now we can extend the linearity and monotonicity to Riemann integrals over 


any bounded domains. 


Theorem 6.22 Linearity 


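Let $\mathfrak{D}$ be a bounded subset of $\mathbb{R}^n$, and let $f : \mathfrak{D} \to \mathbb{R}$ and $g : \mathfrak{D} \to \mathbb{R}$ be
Riemann integrable functions. For any real numbers $\alpha$ and $\beta$, $(\alpha f + \beta g) :
\mathfrak{D} \to \mathbb{R}$ is also Riemann integrable, and

$$\int_{\mathfrak{D}} (\alpha f + \beta g) = \alpha \int_{\mathfrak{D}} f + \beta \int_{\mathfrak{D}} g.$$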


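Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$ be a closed rectangle that contains $\mathfrak{D}$, and let $\check{f} : \mathbf{I} \to \mathbb{R}$
and $\check{g} : \mathbf{I} \to \mathbb{R}$ be the zero extensions of $f : \mathfrak{D} \to \mathbb{R}$ and $g : \mathfrak{D} \to \mathbb{R}$
to $\mathbf{I}$. It is easy to check that $(\alpha \check{f} + \beta \check{g}) : \mathbf{I} \to \mathbb{R}$ is the zero extension of
$(\alpha f + \beta g) : \mathfrak{D} \to \mathbb{R}$ to $\mathbf{I}$. Since $f : \mathfrak{D} \to \mathbb{R}$ and $g : \mathfrak{D} \to \mathbb{R}$ are Riemann
integrable, $\check{f} : \mathbf{I} \to \mathbb{R}$ and $\check{g} : \mathbf{I} \to \mathbb{R}$ are Riemann integrable and

$$\int_{\mathbf{I}} \check{f} = \int_{\mathfrak{D}} f, \qquad \int_{\mathbf{I}} \check{g} = \int_{\mathfrak{D}} g.$$

By Theorem 6.14, $(\alpha \check{f} + \beta \check{g}) : \mathbf{I} \to \mathbb{R}$ is Riemann integrable, and

$$\int_{\mathbf{I}} (\alpha \check{f} + \beta \check{g}) = \alpha \int_{\mathbf{I}} \check{f} + \beta \int_{\mathbf{I}} \check{g} = \alpha \int_{\mathfrak{D}} f + \beta \int_{\mathfrak{D}} g.$$

It follows that $(\alpha f + \beta g) : \mathfrak{D} \to \mathbb{R}$ is also Riemann integrable, and

$$\int_{\mathfrak{D}} (\alpha f + \beta g) = \int_{\mathbf{I}} (\alpha \check{f} + \beta \check{g}) = \alpha \int_{\mathfrak{D}} f + \beta \int_{\mathfrak{D}} g.$$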




Theorem 6.23 


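Let $\mathfrak{D}$ be a bounded subset of $\mathbb{R}^n$, and let $f : \mathfrak{D} \to \mathbb{R}$ be a bounded
function defined on $\mathfrak{D}$. Assume that $f(\mathbf{x}) \ge 0$ for all $\mathbf{x}$ in $\mathfrak{D}$. If $f : \mathfrak{D} \to \mathbb{R}$
is Riemann integrable, then

$$\int_{\mathfrak{D}} f \ge 0.$$

Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$ be a closed rectangle that contains $\mathfrak{D}$, and let $\check{f} : \mathbf{I} \to \mathbb{R}$
be the zero extension of $f : \mathfrak{D} \to \mathbb{R}$ to $\mathbf{I}$. Since $f : \mathfrak{D} \to \mathbb{R}$ is Riemann
integrable, $\check{f} : \mathbf{I} \to \mathbb{R}$ is also Riemann integrable. It is easy to check that
$\check{f}(\mathbf{x}) \ge 0$ for all $\mathbf{x}$ in $\mathbf{I}$. Therefore, by Theorem 6.15,

$$\int_{\mathfrak{D}} f = \int_{\mathbf{I}} \check{f} \ge 0.$$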


As before, monotonicity is a consequence of linearity and Theorem 6.23. 


Theorem 6.24 Monotonicity 


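Let $\mathfrak{D}$ be a bounded subset of $\mathbb{R}^n$, and let $f : \mathfrak{D} \to \mathbb{R}$ and $g : \mathfrak{D} \to \mathbb{R}$ be
Riemann integrable functions. If $f(\mathbf{x}) \ge g(\mathbf{x})$ for all $\mathbf{x}$ in $\mathfrak{D}$, then

$$\int_{\mathfrak{D}} f \ge \int_{\mathfrak{D}} g.$$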


At the end of this section, we want to present two theorems whose proofs are 


almost verbatim those for the n = 1 case. The first theorem says that if a function 


is Riemann integrable, so is its absolute value. 


Theorem 6.25 Absolute Value of Riemann Integrable Functions 


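Let $\mathfrak{D}$ be a bounded subset of $\mathbb{R}^n$, and let $f : \mathfrak{D} \to \mathbb{R}$ be a bounded function
defined on $\mathfrak{D}$. If the function $f : \mathfrak{D} \to \mathbb{R}$ is Riemann integrable, then the
function $|f| : \mathfrak{D} \to \mathbb{R}$ is also Riemann integrable.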


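Sketch of Proof

If $\check{f} : \mathbf{I} \to \mathbb{R}$ is the zero extension of $f : \mathfrak{D} \to \mathbb{R}$ to the closed rectangle
$\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$ that contains $\mathfrak{D}$, then $|\check{f}| : \mathbf{I} \to \mathbb{R}$ is the zero extension of
$|f| : \mathfrak{D} \to \mathbb{R}$. Hence, it is sufficient to consider the case where $\mathfrak{D}$ is a
closed rectangle of the form $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$. The proof is almost the same
as the $n = 1$ case. The key of the proof is the fact that for any subset $A$ of
$\mathbf{I}$,

$$\sup_{\mathbf{x} \in A} |f(\mathbf{x})| - \inf_{\mathbf{x} \in A} |f(\mathbf{x})| \le \sup_{\mathbf{x} \in A} f(\mathbf{x}) - \inf_{\mathbf{x} \in A} f(\mathbf{x}).$$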


The second theorem says that products of Riemann integrable functions are 
Riemann integrable. 


Theorem 6.26 Products of Riemann Integrable Functions 


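Let $\mathfrak{D}$ be a bounded subset of $\mathbb{R}^n$, and let $f : \mathfrak{D} \to \mathbb{R}$ and $g : \mathfrak{D} \to \mathbb{R}$
be bounded functions defined on $\mathfrak{D}$. If the functions $f : \mathfrak{D} \to \mathbb{R}$ and
$g : \mathfrak{D} \to \mathbb{R}$ are Riemann integrable, then the function $(fg) : \mathfrak{D} \to \mathbb{R}$ is also
Riemann integrable.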


Sketch of Proof 
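It is sufficient to consider the case where $\mathfrak{D}$ is a closed rectangle of the form
$\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$. The proof is almost the same as the $n = 1$ case. The key of
the proof is the fact that if $M$ is a positive number such that

$$|f(\mathbf{x})| \le M \quad \text{and} \quad |g(\mathbf{x})| \le M \quad \text{for all } \mathbf{x} \in \mathbf{I},$$

then for any subset $A$ of $\mathbf{I}$,

$$\sup_{\mathbf{x} \in A} (fg)(\mathbf{x}) - \inf_{\mathbf{x} \in A} (fg)(\mathbf{x}) \le M \left( \sup_{\mathbf{x} \in A} f(\mathbf{x}) - \inf_{\mathbf{x} \in A} f(\mathbf{x}) + \sup_{\mathbf{x} \in A} g(\mathbf{x}) - \inf_{\mathbf{x} \in A} g(\mathbf{x}) \right).$$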




Exercises 6.2 


Question 1 


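Let $\mathbf{I} = [0, 3] \times [0, 3]$, and let $f : \mathbf{I} \to \mathbb{R}$ and $g : \mathbf{I} \to \mathbb{R}$ be Riemann
integrable functions. Suppose that

$$f(x, y) = g(y, x) \quad \text{and} \quad (8f + 2g)(x, y) = 10 \quad \text{for all } (x, y) \in \mathbf{I},$$

find $\displaystyle \int_{\mathbf{I}} f$ and $\displaystyle \int_{\mathbf{I}} g$.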


Question 2 


Complete the details in the proof of Theorem 6.25. 


Question 3 


Complete the details in the proof of Theorem 6.26. 




6.3. Jordan Measurable Sets and Riemann Integrable Functions 


In this section, we will give some sufficient conditions for a bounded function
$f : \mathfrak{D} \to \mathbb{R}$ to be Riemann integrable. We start with the following theorem.


Theorem 6.27 


Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : \mathbf{I} \to \mathbb{R}$ be a continuous function defined on
$\mathbf{I}$. Then $f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable.


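Since $f : \mathbf{I} \to \mathbb{R}$ is continuous and $\mathbf{I}$ is compact, $f : \mathbf{I} \to \mathbb{R}$ is uniformly
continuous. Given $\varepsilon > 0$, there exists $\delta > 0$ such that if $\mathbf{u}$ and $\mathbf{v}$ are points
in $\mathbf{I}$ and $\|\mathbf{u} - \mathbf{v}\| < \delta$, then

$$|f(\mathbf{u}) - f(\mathbf{v})| < \frac{\varepsilon}{\operatorname{vol}(\mathbf{I})}.$$

Let $\mathbf{P}$ be any partition of $\mathbf{I}$ with $|\mathbf{P}| < \delta$. A rectangle $\mathbf{J}$ in $\mathcal{J}_{\mathbf{P}}$ is a compact
set. Since $f : \mathbf{J} \to \mathbb{R}$ is continuous, the extreme value theorem says that
there exist points $\mathbf{u}_{\mathbf{J}}$ and $\mathbf{v}_{\mathbf{J}}$ in $\mathbf{J}$ such that

$$f(\mathbf{u}_{\mathbf{J}}) \le f(\mathbf{x}) \le f(\mathbf{v}_{\mathbf{J}}) \quad \text{for all } \mathbf{x} \in \mathbf{J}.$$

Therefore,

$$m_{\mathbf{J}} = \inf_{\mathbf{x} \in \mathbf{J}} f(\mathbf{x}) = f(\mathbf{u}_{\mathbf{J}}) \quad \text{and} \quad M_{\mathbf{J}} = \sup_{\mathbf{x} \in \mathbf{J}} f(\mathbf{x}) = f(\mathbf{v}_{\mathbf{J}}).$$

Since $|\mathbf{P}| < \delta$,

$$\|\mathbf{u}_{\mathbf{J}} - \mathbf{v}_{\mathbf{J}}\| \le \operatorname{diam} \mathbf{J} \le |\mathbf{P}| < \delta.$$

Therefore,

$$M_{\mathbf{J}} - m_{\mathbf{J}} = f(\mathbf{v}_{\mathbf{J}}) - f(\mathbf{u}_{\mathbf{J}}) < \frac{\varepsilon}{\operatorname{vol}(\mathbf{I})}.$$

This implies that

$$U(f, \mathbf{P}) - L(f, \mathbf{P}) = \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} (M_{\mathbf{J}} - m_{\mathbf{J}}) \operatorname{vol}(\mathbf{J}) < \frac{\varepsilon}{\operatorname{vol}(\mathbf{I})} \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} \operatorname{vol}(\mathbf{J}) = \varepsilon.$$

Hence, $f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable.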




Example 6.19 


Let $f : [0,1] \times [0,1] \to \mathbb{R}$ be the function defined as

$$f(x, y) = \sin(xy).$$

This is a composition of the sine function and a polynomial, both of which
are continuous functions. Hence, $f : [0,1] \times [0,1] \to \mathbb{R}$ is a continuous
function. Therefore, $f : [0,1] \times [0,1] \to \mathbb{R}$ is Riemann integrable.
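To see the shrinking Darboux gap from the proof of Theorem 6.27 on this example, one can compute both sums along the uniform partitions. The sketch below is an illustration, with `darboux_sums_sin` a hypothetical helper name. It relies on an observation specific to this $f$: on $[0,1] \times [0,1]$ the product $xy$ lies in $[0, 1] \subset [0, \pi/2]$, where the sine function is increasing, so on each subrectangle the infimum is attained at the lower-left corner and the supremum at the upper-right corner.

```python
import math

def darboux_sums_sin(k):
    """Lower and upper Darboux sums of sin(x*y) on [0,1]x[0,1]
    for the uniform partition into k*k squares."""
    area = 1.0 / (k * k)
    lower = upper = 0.0
    for i in range(1, k + 1):
        for j in range(1, k + 1):
            # sin(x*y) is increasing in x and in y on the unit square.
            lower += math.sin((i - 1) * (j - 1) / (k * k)) * area
            upper += math.sin(i * j / (k * k)) * area
    return lower, upper

for k in (10, 50, 100):
    lo, up = darboux_sums_sin(k)
    print(k, lo, up, up - lo)
```

The gap $U - L$ decreases roughly like $1/k$ for this function, so the lower and upper integrals coincide, in agreement with the theorem.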


In Section 6.1, we have seen that a constant function $f : \mathbf{I} \to \mathbb{R}$, $f(\mathbf{x}) = c$
defined on $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$ is Riemann integrable and its integral is

$$\int_{\mathbf{I}} f = c \operatorname{vol}(\mathbf{I}).$$

Since constant functions are the simplest bounded functions, it is natural to ask
whether a constant function $f : \mathfrak{D} \to \mathbb{R}$, $f(\mathbf{x}) = c$ on a bounded set $\mathfrak{D}$ is always
Riemann integrable. By linearity, it is sufficient to consider the case when $c = 1$.
When $\mathfrak{D}$ is a closed rectangle of the form $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, the answer is affirmative
and we have

$$\int_{\mathbf{I}} d\mathbf{x} = \operatorname{vol}(\mathbf{I}).$$

To consider a general set $\mathfrak{D}$, let us first define the characteristic function of a
set.


Definition 6.13 Characteristic Functions 


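Let $A$ be a subset of $\mathbb{R}^n$. The characteristic function or indicator function
of the set $A$ is the function $\chi_A : \mathbb{R}^n \to \mathbb{R}$ defined as

$$\chi_A(\mathbf{x}) = \begin{cases} 1, & \text{if } \mathbf{x} \in A, \\ 0, & \text{if } \mathbf{x} \notin A. \end{cases}$$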


Example 6.20

Let $A = \{(x, y) \mid x > 0\}$. Notice that the function $\chi_A : \mathbb{R}^2 \to \mathbb{R}$ is given by

$$\chi_A(x, y) = \begin{cases} 1, & \text{if } x > 0, \\ 0, & \text{if } x \le 0. \end{cases}$$

It is continuous at $(x, y)$ if and only if $x > 0$ or $x < 0$.


Figure 6.12: The set A = {(x, y) |x > O}. 


Interior, Exterior and Boundary of a Set 


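In Chapter 1, we have seen that if $A$ is a subset of $\mathbb{R}^n$, then $\mathbb{R}^n$ is a disjoint
union of $\operatorname{int} A$, $\operatorname{ext} A$ and $\partial A$.

If $\mathbf{x}_0$ is a point in $\mathbb{R}^n$, $\mathbf{x}_0 \in \operatorname{int} A$ if and only if there is an $r > 0$ such
that $B(\mathbf{x}_0, r) \subset A$; $\mathbf{x}_0 \in \operatorname{ext} A$ if and only if there is an $r > 0$ such that
$B(\mathbf{x}_0, r) \subset \mathbb{R}^n \setminus A$; and $\mathbf{x}_0 \in \partial A$ if and only if for every $r > 0$, $B(\mathbf{x}_0, r)$ contains a
point in $A$ and a point not in $A$.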


Theorem 6.28 


Let $A$ be a subset of $\mathbb{R}^n$, and let $\chi_A : \mathbb{R}^n \to \mathbb{R}$ be the characteristic function
of $A$. Then the set of discontinuities of the function $\chi_A$ is the set $\partial A$.


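Since $\mathbb{R}^n$ is a disjoint union of $\operatorname{int} A$, $\operatorname{ext} A$ and $\partial A$, we will show that $\chi_A$ is
continuous on $\operatorname{int} A$ and $\operatorname{ext} A$, and discontinuous at every point in $\partial A$.
The sets $\operatorname{int} A$ and $\operatorname{ext} A$ are open sets, and $\chi_A$ is equal to 1 on $\operatorname{int} A$ and 0
on $\operatorname{ext} A$. For every $\mathbf{x}_0$ in $\operatorname{int} A$, there is an $r > 0$ such that $B(\mathbf{x}_0, r) \subset A$.
Therefore, for any $\varepsilon > 0$, if $\mathbf{x}$ is such that $\|\mathbf{x} - \mathbf{x}_0\| < r$, then

$$|\chi_A(\mathbf{x}) - \chi_A(\mathbf{x}_0)| = 0 < \varepsilon.$$

This shows that $\chi_A$ is continuous at $\mathbf{x}_0$. Similarly, if $\mathbf{x}_0$ is in $\operatorname{ext} A$, there is
an $r > 0$ such that $B(\mathbf{x}_0, r) \subset \mathbb{R}^n \setminus A$. The same reasoning shows that $\chi_A$ is
continuous at $\mathbf{x}_0$.

Now consider a point $\mathbf{x}_0$ that is in $\partial A$. For any $k \in \mathbb{Z}^+$, there is a point
$\mathbf{u}_k \in A$ and a point $\mathbf{v}_k \notin A$ such that $\mathbf{u}_k$ and $\mathbf{v}_k$ are in the neighbourhood
$B(\mathbf{x}_0, 1/k)$ of $\mathbf{x}_0$. The two sequences $\{\mathbf{u}_k\}$ and $\{\mathbf{v}_k\}$ both converge to $\mathbf{x}_0$,
but the sequence $\{\chi_A(\mathbf{u}_k)\}$ converges to 1, while the sequence $\{\chi_A(\mathbf{v}_k)\}$ converges
to 0. This shows that $\chi_A$ is not continuous at $\mathbf{x}_0$.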


Figure 6.13: The characteristic function of a set $A$ is not continuous at $\mathbf{x}_0$ if and
only if $\mathbf{x}_0 \in \partial A$.

By definition, restricted to the set $A$, $\chi_A : A \to \mathbb{R}$ is the constant function
$\chi_A(\mathbf{x}) = 1$. Now we define Jordan measurable sets and their volumes.




Definition 6.14 Jordan Measurable Sets and Volume 


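Let $\mathfrak{D}$ be a bounded subset of $\mathbb{R}^n$. We say that $\mathfrak{D}$ is Jordan measurable if
the constant function $\chi_{\mathfrak{D}} : \mathfrak{D} \to \mathbb{R}$ is Riemann integrable. In this case, we
define the volume of $\mathfrak{D}$ as

$$\operatorname{vol}(\mathfrak{D}) = \int_{\mathfrak{D}} \chi_{\mathfrak{D}} = \int_{\mathfrak{D}} d\mathbf{x}.$$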


Example 6.21 


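The closed rectangle $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$ is Jordan measurable, and its volume is

$$\operatorname{vol}(\mathbf{I}) = \int_{\mathbf{I}} d\mathbf{x} = \prod_{i=1}^{n} (b_i - a_i),$$

which agrees with the volume we have defined earlier.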


Example 6.22 


Example 6.14 says that the set

$$\mathfrak{D} = \{(x, y) \mid 0 \le y \le x \le 1\}$$

is Jordan measurable and $\operatorname{vol}(\mathfrak{D}) = \dfrac{1}{2}$.


Figure 6.14: The set D = {(x, y)|0 < y < x < 1} is Jordan measurable. 


One might think that all bounded subsets of $\mathbb{R}^n$ have volumes. This is not true.
An example is given below.


Example 6.23 


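Let $\mathbf{I} = [0, 1]^n$ and let

$$\mathfrak{D} = \{\mathbf{x} \in \mathbf{I} \mid \mathbf{x} \in \mathbb{Q}^n\}.$$

Notice that $\mathfrak{D}$ is a subset of the rectangle $\mathbf{I}$, and the zero extension of $\chi_{\mathfrak{D}} :
\mathfrak{D} \to \mathbb{R}$ is the function $\chi_{\mathfrak{D}} : \mathbf{I} \to \mathbb{R}$,

$$\chi_{\mathfrak{D}}(\mathbf{x}) = \begin{cases} 1, & \text{if all components of } \mathbf{x} \text{ are rational}, \\ 0, & \text{otherwise}, \end{cases}$$

which is the Dirichlet function. We have seen in Example 6.13 that the
function $\chi_{\mathfrak{D}} : \mathbf{I} \to \mathbb{R}$ is not Riemann integrable. Hence, $\chi_{\mathfrak{D}} : \mathfrak{D} \to \mathbb{R}$ is not
Riemann integrable. This means the set $\mathfrak{D}$ is not Jordan measurable and so
it does not have a volume.

This example also shows that if $B$ is a subset of $A$, and the function $f : A \to \mathbb{R}$
is Riemann integrable, the function $f : B \to \mathbb{R}$ is not necessarily Riemann
integrable.
The next example says that the boundary of a rectangle has volume 0.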


Example 6.24 


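Let $\mathbf{I} = \prod_{i=1}^{n} [a_i, b_i]$, and let $\mathfrak{D} = \partial \mathbf{I}$. Notice that $\mathfrak{D}$ is contained in $\mathbf{I}$. The
zero extension of $\chi_{\mathfrak{D}} : \mathfrak{D} \to \mathbb{R}$ is the function $\chi_{\mathfrak{D}} : \mathbf{I} \to \mathbb{R}$ which vanishes
on the interior of $\mathbf{I}$. By Theorem 6.19, $\chi_{\mathfrak{D}} : \mathbf{I} \to \mathbb{R}$ is Riemann integrable
and $\displaystyle \int_{\mathbf{I}} \chi_{\mathfrak{D}} = 0$. Therefore, $\mathfrak{D} = \partial \mathbf{I}$ has zero volume.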
I 




Remark 6.2 Darboux Sums for a Characteristic Function 


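Given a bounded set $\mathfrak{D}$ that is contained in the rectangle $\mathbf{I}$, if $\mathbf{P}$ is a partition
of $\mathbf{I}$, $L(\chi_{\mathfrak{D}}, \mathbf{P})$ is the sum of the volumes of the rectangles in $\mathbf{P}$ that are
contained in $\mathfrak{D}$; while $U(\chi_{\mathfrak{D}}, \mathbf{P})$ is the sum of the volumes of the rectangles
in $\mathbf{P}$ that intersect $\mathfrak{D}$. See Figure 6.15.
Thus, for $\mathfrak{D}$ to have volume, the two numbers $L(\chi_{\mathfrak{D}}, \mathbf{P})$ and $U(\chi_{\mathfrak{D}}, \mathbf{P})$
should get closer and closer as the partition $\mathbf{P}$ gets finer.

Figure 6.15: The geometric quantities represented by $L(\chi_{\mathfrak{D}}, \mathbf{P})$ and $U(\chi_{\mathfrak{D}}, \mathbf{P})$
when $\mathfrak{D}$ is the region bounded inside the circle.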
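The two quantities in this remark can be computed directly when $\mathfrak{D}$ is the closed unit disc and $\mathbf{I} = [-1, 1] \times [-1, 1]$. The following sketch is an illustration under assumptions made here: the helper name `disc_darboux_sums` and the uniform $k \times k$ partition are not from the text. It counts the squares contained in the disc and the squares that intersect it. Grid coordinates are kept as integers in units of $1/k$ so that the membership tests are exact.

```python
import math

def disc_darboux_sums(k):
    """L(chi_D, P_k) and U(chi_D, P_k) for the closed unit disc D inside
    [-1,1]x[-1,1], where P_k is the uniform partition into k*k squares."""
    inside = touching = 0
    for i in range(k):
        for j in range(k):
            # Edges of the cell, multiplied by k so they are integers.
            x0, x1 = 2 * i - k, 2 * (i + 1) - k
            y0, y1 = 2 * j - k, 2 * (j + 1) - k
            # The point of the cell farthest from the origin is a corner.
            far = max(x0 * x0, x1 * x1) + max(y0 * y0, y1 * y1)
            # The nearest point is obtained by clamping 0 into each edge.
            nx = min(max(0, x0), x1)
            ny = min(max(0, y0), y1)
            near = nx * nx + ny * ny
            inside += far <= k * k       # the whole cell lies in D
            touching += near <= k * k    # the cell meets D
    cell = 4.0 / (k * k)
    return inside * cell, touching * cell

for k in (20, 200):
    lo, up = disc_darboux_sums(k)
    print(k, lo, up, math.pi)
```

The lower sums increase and the upper sums decrease towards the area $\pi$ of the disc as $k$ grows, which is the behaviour the remark describes for a set that has a volume.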


Our goal is to give a characterization of sets that are Jordan measurable. We will
consider those that have zero volume first. The following is a useful lemma.


Lemma 6.29 


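Let $\mathbf{I}$ be a closed rectangle in $\mathbb{R}^n$ that contains the closed rectangles
$\mathbf{I}_1, \ldots, \mathbf{I}_k$. There is a partition $\mathbf{P}$ of $\mathbf{I}$ such that if $\mathbf{J}$ is a rectangle in the
partition $\mathbf{P}$, then $\mathbf{J}$ is either contained in $\mathbf{I}_j$ for some $1 \le j \le k$, or $\mathbf{J}$ is
disjoint from the interiors of $\mathbf{I}_j$ for all $1 \le j \le k$.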


Sketch of Proof
We construct the partition $\mathbf{P} = (P_1, \ldots, P_n)$ in the following way. For each
$1 \le i \le n$, the set of partition points in $P_i$ is the set of end points of the $i^{\text{th}}$-edges
of $\mathbf{I}, \mathbf{I}_1, \ldots, \mathbf{I}_k$. One can check that this partition satisfies the requirement.
See Figure 6.16 for an illustration.


Figure 6.16: A partition of the rectangle I that satisfies the conditions in Lemma 
6.29. 
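The construction in the sketch of proof of Lemma 6.29 is easy to carry out by machine. The code below is a minimal illustration under assumptions made here: rectangles are represented as lists of `(lo, hi)` edges, every small rectangle is assumed to lie inside the big one, and all function names and the sample rectangles are hypothetical. It builds the partition from the end points and then checks the dichotomy stated in the lemma for every cell.

```python
from itertools import product

def endpoint_partition(big, rects):
    """For each coordinate i, collect the end points of the i-th edges of
    the big rectangle and of every smaller rectangle."""
    n = len(big)
    return [sorted({big[i][0], big[i][1]} |
                   {r[i][e] for r in rects for e in (0, 1)})
            for i in range(n)]

def cells(partition):
    """All rectangles of the partition, each as a tuple of (lo, hi) edges."""
    edges = [list(zip(p, p[1:])) for p in partition]
    return list(product(*edges))

def contained(cell, rect):
    """True if the cell lies inside the closed rectangle."""
    return all(lo <= c0 and c1 <= hi
               for (c0, c1), (lo, hi) in zip(cell, rect))

def misses_interior(cell, rect):
    """True if the cell is disjoint from the interior of the rectangle."""
    return any(c1 <= lo or c0 >= hi
               for (c0, c1), (lo, hi) in zip(cell, rect))

big = [(0, 10), (0, 10)]
rects = [[(1, 4), (2, 6)], [(3, 8), (5, 9)]]
P = endpoint_partition(big, rects)
ok = all(any(contained(J, r) for r in rects) or
         all(misses_interior(J, r) for r in rects)
         for J in cells(P))
print(P, ok)
```

Every cell edge runs between two consecutive end points, so in each coordinate it either lies inside a given edge of $\mathbf{I}_j$ or misses its interior, which is why the check succeeds.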


Let us introduce the definition of a cube. 


Definition 6.15 Cubes 


A rectangle of the form $\prod_{i=1}^{n} [a_i, b_i]$ such that

$$b_1 - a_1 = b_2 - a_2 = \cdots = b_n - a_n = \ell = 2r$$

is called a (closed) cube with side length $\ell = 2r$. The center of the cube is

$$\mathbf{c} = \left( \frac{a_1 + b_1}{2}, \frac{a_2 + b_2}{2}, \ldots, \frac{a_n + b_n}{2} \right).$$

We will denote such a cube by $Q_{\mathbf{c}, r}$.

There are also cubes whose edges are not parallel to the coordinate axes. In
this chapter, when we say a cube, we always mean a cube defined above.
Now we can give a characterization of sets with zero volume. 




Theorem 6.30 


Let D be a bounded subset of R”. The following are equivalent. 


(a) The set D is Jordan measurable and it has zero volume. 


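(b) For any $\varepsilon > 0$, there are finitely many closed cubes $Q_1, \ldots, Q_k$ such that

$$D \subset \bigcup_{j=1}^{k} Q_j \quad \text{and} \quad \sum_{j=1}^{k} \text{vol}(Q_j) < \varepsilon.$$

(c) For any $\varepsilon > 0$, there are finitely many closed rectangles $I_1, \ldots, I_k$ such that

$$D \subset \bigcup_{j=1}^{k} I_j \quad \text{and} \quad \sum_{j=1}^{k} \text{vol}(I_j) < \varepsilon.$$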


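First assume that D is a Jordan measurable set with zero volume. There is a positive number R such that the closed cube $Q_{\mathbf{0}, R} = [-R, R]^n$ contains the set D. Let $I = Q_{\mathbf{0}, R}$. Then the function $\chi_D : I \to \mathbb{R}$ is Riemann integrable and $\int_I \chi_D = 0$. Given $m \in \mathbb{Z}^+$, let $P_m$ be the uniformly regular partition of I into $m^n$ rectangles. Notice that each rectangle in the partition $P_m$ is a cube. Since $\lim_{m \to \infty} |P_m| = 0$, we have

$$\lim_{m \to \infty} U(\chi_D, P_m) = \int_I \chi_D = 0.$$

Given $\varepsilon > 0$, there is a positive integer M such that for all $m \ge M$,

$$U(\chi_D, P_m) < \varepsilon.$$

Consider the partition $P_M$. Notice that for $J \in \mathcal{J}_{P_M}$,

$$M_J(\chi_D) = \begin{cases} 1, & \text{if } J \cap D \ne \emptyset, \\ 0, & \text{if } J \cap D = \emptyset. \end{cases}$$

Let

$$\mathcal{A} = \{ J \in \mathcal{J}_{P_M} \mid J \cap D \ne \emptyset \}.$$

Then

$$U(\chi_D, P_M) = \sum_{J \in \mathcal{A}} \text{vol}(J).$$

$\mathcal{A}$ is a finite collection of cubes. Hence, we can name the cubes in $\mathcal{A}$ as $Q_1, \ldots, Q_k$. By construction,

$$D \subset \bigcup_{j=1}^{k} Q_j \quad \text{and} \quad \sum_{j=1}^{k} \text{vol}(Q_j) < \varepsilon.$$

This proves that (a) implies (b).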

(b) implies (c) is obvious since a cube is a rectangle. 

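Now assume that (c) holds. Given $\varepsilon > 0$, (c) says that there are closed rectangles $I_1, \ldots, I_k$ such that

$$D \subset \bigcup_{j=1}^{k} I_j \quad \text{and} \quad \sum_{j=1}^{k} \text{vol}(I_j) < \frac{\varepsilon}{2}.$$

By Lemma 6.18, for each $1 \le j \le k$, there is a closed rectangle $\tilde{I}_j$ such that $I_j \subset \text{int}\, \tilde{I}_j$ and

$$\text{vol}(\tilde{I}_j) - \text{vol}(I_j) < \frac{\varepsilon}{2k}.$$

It follows that

$$D \subset \bigcup_{j=1}^{k} \text{int}\, \tilde{I}_j \quad \text{and} \quad \sum_{j=1}^{k} \text{vol}(\tilde{I}_j) < \varepsilon.$$

Let I be a closed rectangle whose interior contains the bounded set $\bigcup_{j=1}^{k} \tilde{I}_j$. By Lemma 6.29, there is a partition P of I such that each rectangle J in the partition P is either contained in $\tilde{I}_j$ for some $1 \le j \le k$, or is disjoint from the interiors of $\tilde{I}_j$ for all $1 \le j \le k$. Let

$$\mathcal{B} = \{ J \in \mathcal{J}_P \mid J \subset \tilde{I}_j \text{ for some } 1 \le j \le k \}.$$

If $J \notin \mathcal{B}$, then $J \cap \text{int}\, \tilde{I}_j = \emptyset$ for all $1 \le j \le k$. Therefore, $J \cap D = \emptyset$. For these J, $M_J(\chi_D) = m_J(\chi_D) = 0$. If J is in $\mathcal{B}$, we use the simple estimate $M_J \le 1$. Thus,

$$U(\chi_D, P) = \sum_{J \in \mathcal{B}} M_J \,\text{vol}(J) \le \sum_{J \in \mathcal{B}} \text{vol}(J) \le \sum_{j=1}^{k} \text{vol}(\tilde{I}_j) < \varepsilon.$$

Since $L(\chi_D, P) \ge 0$, we find that

$$U(\chi_D, P) - L(\chi_D, P) < \varepsilon.$$

This proves that $\chi_D : I \to \mathbb{R}$ is Riemann integrable. Since we have shown that there exists a partition P such that $U(\chi_D, P) < \varepsilon$, we have

$$\text{vol}(D) = \int_I \chi_D \le U(\chi_D, P) < \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, we find that $\text{vol}(D) = 0$. This completes the proof of (c) implies (a).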
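
To see the argument for (a) implies (b) at work, one can count the cubes of a uniformly regular partition that meet a concrete set. The sketch below is an illustration only: the set (the diagonal segment of the unit square) and the function name `upper_sum_diagonal` are hypothetical choices made here. It computes $U(\chi_D, P_m)$ exactly with rational arithmetic.

```python
from fractions import Fraction

def upper_sum_diagonal(m):
    """U(chi_D, P_m) for D = {(t, t) : 0 <= t <= 1} and the uniform
    partition P_m of [0, 1]^2 into m*m closed squares."""
    count = 0
    for i in range(m):
        for j in range(m):
            # [i/m, (i+1)/m] x [j/m, (j+1)/m] meets the diagonal exactly
            # when the two intervals overlap, i.e. when |i - j| <= 1.
            if abs(i - j) <= 1:
                count += 1
    return Fraction(count, m * m)

for m in (1, 10, 100):
    print(m, upper_sum_diagonal(m))
```

For this set the number of squares that meet D is $3m - 2$, so the upper sum is $(3m - 2)/m^2$, which tends to 0 as m grows. The squares counted are precisely a finite family of cubes covering D with small total volume.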


Motivated by Theorem 6.30, we make the following definition. 


Definition 6.16 Jordan Content Zero 


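Let D be a bounded subset of $\mathbb{R}^n$. We say that D has Jordan content zero provided that for any $\varepsilon > 0$, there are finitely many closed rectangles $I_1, \ldots, I_k$ such that

$$D \subset \bigcup_{j=1}^{k} I_j \quad \text{and} \quad \sum_{j=1}^{k} \text{vol}(I_j) < \varepsilon.$$
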


Sets that have Jordan Content Zero 
Let D be a bounded subset of R”. Theorem 6.30 says that D is Jordan 


measurable with volume zero if and only if it has Jordan content zero. 


The characterization of sets with zero volume given in Theorem 6.30 facilitates 
the proofs of properties of such sets. 


Theorem 6.31 


Let $D_1$ and $D_2$ be bounded subsets of $\mathbb{R}^n$. If $D_1$ has Jordan content zero and $D_2 \subset D_1$, then $D_2$ also has Jordan content zero.


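Given $\varepsilon > 0$, since $D_1$ has Jordan content zero, there are closed rectangles $I_1, \ldots, I_k$ such that

$$D_1 \subset \bigcup_{j=1}^{k} I_j \quad \text{and} \quad \sum_{j=1}^{k} \text{vol}(I_j) < \varepsilon.$$

Since $D_2 \subset D_1$, we find that

$$D_2 \subset \bigcup_{j=1}^{k} I_j \quad \text{and} \quad \sum_{j=1}^{k} \text{vol}(I_j) < \varepsilon.$$

Therefore, $D_2$ also has Jordan content zero.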


Example 6.25 


Let D be the subset of $\mathbb{R}^3$ given by

$$D = \{ (x, y, 2) \mid -2 < x < 3,\ -5 < y < 7 \}.$$


Show that D is a Jordan measurable set with zero volume. 




Solution 
Let $I = [-2, 3] \times [-5, 7] \times [2, 3]$. Then I is a closed rectangle in $\mathbb{R}^3$. Example 6.24 says that $\partial I$ has Jordan content zero. Since $D \subset \partial I$, Theorem 6.31 says that D has Jordan content zero. Hence, D is a Jordan measurable set with zero volume.


The next theorem concerns unions and intersections of sets of Jordan content zero.

Theorem 6.32 


(a) If $\mathcal{A} = \{ D_\alpha \mid \alpha \in J \}$ is a collection of sets that have Jordan content zero, then their intersection $U = \bigcap_{\alpha \in J} D_\alpha$ also has Jordan content zero.

(b) If $D_1, \ldots, D_m$ are finitely many sets that have Jordan content zero, then their union $D = \bigcup_{j=1}^{m} D_j$ is also a set that has Jordan content zero.


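(a) is obvious since $U \subset D_\alpha$ for any $\alpha \in J$.

(b) is basically a consequence of the fact that a finite union of finite sets is finite. Given $\varepsilon > 0$, for each $1 \le j \le m$, since $D_j$ has Jordan content zero, there is a finite collection $\mathcal{B}_j = \{ I_{\beta_j} \mid \beta_j \in J_j \}$ of closed rectangles such that

$$D_j \subset \bigcup_{\beta_j \in J_j} I_{\beta_j}, \qquad \sum_{\beta_j \in J_j} \text{vol}(I_{\beta_j}) < \frac{\varepsilon}{m}.$$

Let $\mathcal{B} = \bigcup_{j=1}^{m} \mathcal{B}_j$. Since each $\mathcal{B}_j$, $1 \le j \le m$, is finite, $\mathcal{B}$ is also a finite collection of closed rectangles. Moreover,

$$D = \bigcup_{j=1}^{m} D_j \subset \bigcup_{j=1}^{m} \bigcup_{\beta_j \in J_j} I_{\beta_j} = \bigcup_{I_\beta \in \mathcal{B}} I_\beta,$$

$$\sum_{I_\beta \in \mathcal{B}} \text{vol}(I_\beta) \le \sum_{j=1}^{m} \sum_{\beta_j \in J_j} \text{vol}(I_{\beta_j}) < \varepsilon.$$

This shows that D has Jordan content zero.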


Example 6.26 


It is obvious that a one-point subset of $\mathbb{R}^n$ has Jordan content zero. It follows that any finite subset of $\mathbb{R}^n$ has Jordan content zero.


Now we want to consider general Jordan measurable sets. We first prove the 


following two theorems, giving more examples of Riemann integrable functions. 


The first one is a special case of the second one, but we need to prove it first to 


prove the second theorem. 


Theorem 6.33 


Let $I = \prod_{i=1}^{n} [a_i, b_i]$, and let $f : I \to \mathbb{R}$ be a bounded function defined on I. If $f : I \to \mathbb{R}$ is continuous on the interior of I, then $f : I \to \mathbb{R}$ is Riemann integrable.


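We will show that for any $\varepsilon > 0$, there is a partition $\mathbf{P}$ of I such that

$$U(f, \mathbf{P}) - L(f, \mathbf{P}) < \varepsilon.$$

Since $f : I \to \mathbb{R}$ is a bounded function, there is a positive number M such that

$$|f(\mathbf{x})| \le M \quad \text{for all } \mathbf{x} \in I.$$

By Lemma 6.18, there is a closed rectangle $\tilde{I} = \prod_{i=1}^{n} [u_i, v_i]$ contained in the interior of I, such that

$$\text{vol}(I) - \text{vol}(\tilde{I}) < \frac{\varepsilon}{4M}.$$

Let $\mathbf{P}_0 = (P_1, \ldots, P_n)$ be the partition of I given by $P_i = \{ a_i, u_i, v_i, b_i \}$ for $1 \le i \le n$. Then $\tilde{I}$ is a rectangle in the partition $\mathbf{P}_0$.

Since $f : \tilde{I} \to \mathbb{R}$ is continuous, Theorem 6.27 implies that there is a partition $\mathbf{P}_1$ of $\tilde{I}$ such that

$$U(f, \mathbf{P}_1) - L(f, \mathbf{P}_1) < \frac{\varepsilon}{2}.$$

Let $\mathbf{P}$ be the partition of I that contains all the partition points in $\mathbf{P}_0$ and $\mathbf{P}_1$. Then $\mathbf{P}$ is a refinement of $\mathbf{P}_0$, and the partition that $\mathbf{P}$ induces on $\tilde{I}$ is $\mathbf{P}(\tilde{I}) = \mathbf{P}_1$. By Proposition 6.3,

$$U(f, \mathbf{P}) - L(f, \mathbf{P}) = \sum_{J \in \mathcal{J}_{\mathbf{P}_0}} \left( U(f, \mathbf{P}(J)) - L(f, \mathbf{P}(J)) \right)$$

$$= U(f, \mathbf{P}_1) - L(f, \mathbf{P}_1) + \sum_{J \in \mathcal{J}_{\mathbf{P}_0} \setminus \{\tilde{I}\}} \left( U(f, \mathbf{P}(J)) - L(f, \mathbf{P}(J)) \right).$$

For each J in $\mathcal{J}_{\mathbf{P}_0} \setminus \{\tilde{I}\}$, we use the crude estimate

$$U(f, \mathbf{P}(J)) - L(f, \mathbf{P}(J)) \le 2M \,\text{vol}(J).$$

These imply that

$$U(f, \mathbf{P}) - L(f, \mathbf{P}) < \frac{\varepsilon}{2} + 2M \sum_{J \in \mathcal{J}_{\mathbf{P}_0} \setminus \{\tilde{I}\}} \text{vol}(J) = \frac{\varepsilon}{2} + 2M \left( \text{vol}(I) - \text{vol}(\tilde{I}) \right) < \varepsilon.$$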


Set of Discontinuities of a Function 


Given a function $f : A \to \mathbb{R}$ defined on the set A, the set of discontinuities of f is the set of all points $\mathbf{x}_0$ in A such that f is not continuous at $\mathbf{x}_0$.

If B is a subset of A, and $\mathbf{x}_0$ is a point of B, then $f : A \to \mathbb{R}$ being continuous at $\mathbf{x}_0$ implies that $f : B \to \mathbb{R}$ is continuous at $\mathbf{x}_0$. Hence, the set of discontinuities of the function $f : B \to \mathbb{R}$ is a subset of the set of discontinuities of the function $f : A \to \mathbb{R}$.




Theorem 6.34 


Let $I = \prod_{i=1}^{n} [a_i, b_i]$. Given that $f : I \to \mathbb{R}$ is a bounded function defined on I, let $N_f$ be the set of discontinuities of $f : I \to \mathbb{R}$. If $N_f$ is a set that has Jordan content zero, then $f : I \to \mathbb{R}$ is Riemann integrable.


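We will show that for any $\varepsilon > 0$, there is a partition P of I such that

$$U(f, P) - L(f, P) < \varepsilon.$$

Since $f : I \to \mathbb{R}$ is a bounded function, there is a positive number M such that

$$|f(\mathbf{x})| \le M \quad \text{for all } \mathbf{x} \in I.$$

Since $N_f$ is a set of Jordan content zero that is contained in I, there are closed rectangles $I_1, \ldots, I_k$ such that

$$N_f \subset \bigcup_{j=1}^{k} I_j \quad \text{and} \quad \sum_{j=1}^{k} \text{vol}(I_j) < \frac{\varepsilon}{4M}.$$

By Lemma 6.29, there is a partition $P_0$ of I such that each rectangle J in the partition $P_0$ is either contained in $I_j$ for some $1 \le j \le k$, or is disjoint from the interiors of $I_j$ for all $1 \le j \le k$.
Let

$$\mathcal{A} = \{ J \in \mathcal{J}_{P_0} \mid J \subset I_j \text{ for some } 1 \le j \le k \}$$

and

$$\mathcal{B} = \{ J \in \mathcal{J}_{P_0} \mid J \cap \text{int}(I_j) = \emptyset \text{ for all } 1 \le j \le k \}.$$

Assume that $\mathcal{B}$ contains N rectangles. If $J \in \mathcal{B}$, $f : J \to \mathbb{R}$ is continuous on the interior of J. By Theorem 6.33, $f : J \to \mathbb{R}$ is Riemann integrable. Therefore, there is a partition $P_J$ of J such that

$$U(f, P_J) - L(f, P_J) < \frac{\varepsilon}{2N}.$$

The rest of the proof is similar to the proof of Theorem 6.33. Let P be the partition of I which contains all the partition points in $P_0$ and $P_J$ for all $J \in \mathcal{B}$. Then

$$U(f, P) - L(f, P) = \sum_{J \in \mathcal{J}_{P_0}} \left( U(f, P(J)) - L(f, P(J)) \right)$$

$$= \sum_{J \in \mathcal{A}} \left( U(f, P(J)) - L(f, P(J)) \right) + \sum_{J \in \mathcal{B}} \left( U(f, P(J)) - L(f, P(J)) \right).$$

For each J in $\mathcal{A}$, we use the crude estimate

$$U(f, P(J)) - L(f, P(J)) \le 2M \,\text{vol}(J).$$

Using the fact that

$$\sum_{J \in \mathcal{A}} \text{vol}(J) \le \sum_{j=1}^{k} \text{vol}(I_j) < \frac{\varepsilon}{4M},$$

we find that

$$\sum_{J \in \mathcal{A}} \left( U(f, P(J)) - L(f, P(J)) \right) < \frac{\varepsilon}{2}.$$

For each $J \in \mathcal{B}$, $P(J)$ is a refinement of $P_J$, and thus

$$U(f, P(J)) - L(f, P(J)) \le U(f, P_J) - L(f, P_J) < \frac{\varepsilon}{2N}.$$

This implies that

$$\sum_{J \in \mathcal{B}} \left( U(f, P(J)) - L(f, P(J)) \right) < \frac{\varepsilon}{2}.$$

These give us $U(f, P) - L(f, P) < \varepsilon$, as desired.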


Now we can prove the following characterization of Jordan measurable sets. 




Theorem 6.35 


Let D be a bounded subset of R”. The following are equivalent. 


(a) D is a Jordan measurable set.


(b) The boundary of D has Jordan content zero. 


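Let I be a closed rectangle that contains D. By definition, D is Jordan measurable if and only if the function $\chi_D : I \to \mathbb{R}$ is Riemann integrable.

By Theorem 6.28, the set of discontinuities of the function $\chi_D : I \to \mathbb{R}$ is the set $\partial D$. If the boundary of D has Jordan content zero, Theorem 6.34 implies that $\chi_D : I \to \mathbb{R}$ is Riemann integrable. This proves (b) implies (a).

Conversely, if D is Jordan measurable, given $\varepsilon > 0$, there is a partition P such that

$$U(\chi_D, P) - L(\chi_D, P) < \frac{\varepsilon}{2}.$$

For each J in $\mathcal{J}_P$, there are only three possibilities for the pair $(m_J, M_J)$, namely $(1, 1)$, $(0, 0)$ or $(0, 1)$. Let

$$\mathcal{A} = \{ J \in \mathcal{J}_P \mid m_J = M_J = 1 \},$$
$$\mathcal{B} = \{ J \in \mathcal{J}_P \mid m_J = M_J = 0 \},$$
$$\mathcal{C} = \{ J \in \mathcal{J}_P \mid m_J = 0,\ M_J = 1 \}.$$

Then $\mathcal{J}_P = \mathcal{A} \cup \mathcal{B} \cup \mathcal{C}$, and we have

$$U(\chi_D, P) - L(\chi_D, P) = \sum_{J \in \mathcal{J}_P} (M_J - m_J) \,\text{vol}(J) = \sum_{J \in \mathcal{C}} \text{vol}(J).$$

This implies that

$$\sum_{J \in \mathcal{C}} \text{vol}(J) < \frac{\varepsilon}{2}.$$

Notice that J is in $\mathcal{A}$ if and only if $\chi_D(\mathbf{x}) = 1$ for all $\mathbf{x} \in J$, if and only if $J \subset D$. This implies that

$$\text{int}\, J \subset \text{int}\, D \quad \text{for all } J \in \mathcal{A}.$$

Similarly, J is in $\mathcal{B}$ if and only if $\chi_D(\mathbf{x}) = 0$ for all $\mathbf{x} \in J$, if and only if $J \subset \mathbb{R}^n \setminus D$. This implies that

$$\text{int}\, J \subset \text{ext}\, D \quad \text{for all } J \in \mathcal{B}.$$

Let

$$S = \bigcup_{J \in \mathcal{A} \cup \mathcal{B}} \partial J.$$

Since $\mathbb{R}^n$ is a disjoint union of $\text{int}\, D$, $\text{ext}\, D$ and $\partial D$, we must have

$$\partial D \subset \left( \bigcup_{J \in \mathcal{C}} J \right) \cup S.$$

Since the boundary of a closed rectangle has Jordan content zero, and $\mathcal{A} \cup \mathcal{B}$ is a finite set, Theorem 6.32 implies that S has Jordan content zero. Hence, there is a finite collection of closed rectangles $\mathcal{S} = \{ I_j \mid 1 \le j \le k \}$ such that

$$S \subset \bigcup_{j=1}^{k} I_j \quad \text{and} \quad \sum_{j=1}^{k} \text{vol}(I_j) < \frac{\varepsilon}{2}.$$

Let $\mathcal{K} = \mathcal{C} \cup \mathcal{S}$. Then $\mathcal{K}$ is a finite collection of closed rectangles,

$$\partial D \subset \bigcup_{K \in \mathcal{K}} K \quad \text{and} \quad \sum_{K \in \mathcal{K}} \text{vol}(K) < \varepsilon.$$

This shows that $\partial D$ has Jordan content zero.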


Using Theorem 6.35, we can obtain more examples of Jordan measurable sets. 
First we prove the following. 

Lemma 6.36 

Let A and B be subsets of R”. Then 


$$\partial (A \cup B) \subset \partial A \cup \partial B, \qquad \partial (A \cap B) \subset \partial A \cup \partial B.$$




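If $\mathbf{x}_0$ is a point in $\partial (A \cup B)$, there is a sequence of points $\{\mathbf{u}_k\}$ in $A \cup B$ that converges to $\mathbf{x}_0$. Each point in this sequence is either in A or in B. Therefore, there is a subsequence $\{\mathbf{u}_{k_j}\}$ that lies entirely in A or entirely in B. There is also a sequence $\{\mathbf{v}_k\}$ in $\mathbb{R}^n \setminus (A \cup B)$ that converges to $\mathbf{x}_0$. This sequence is in both $\mathbb{R}^n \setminus A$ and $\mathbb{R}^n \setminus B$. Therefore, $\mathbf{x}_0$ is in $\partial A$ or in $\partial B$.

If $\mathbf{x}_0$ is a point in $\partial (A \cap B)$, there is a sequence of points $\{\mathbf{u}_k\}$ in $\mathbb{R}^n \setminus (A \cap B)$ that converges to $\mathbf{x}_0$. Each point in this sequence is either in $\mathbb{R}^n \setminus A$ or in $\mathbb{R}^n \setminus B$. Therefore, there is a subsequence $\{\mathbf{u}_{k_j}\}$ that lies entirely in $\mathbb{R}^n \setminus A$ or entirely in $\mathbb{R}^n \setminus B$. There is also a sequence $\{\mathbf{v}_k\}$ in $A \cap B$ that converges to $\mathbf{x}_0$. This sequence is in both A and B. Therefore, $\mathbf{x}_0$ is in $\partial A$ or in $\partial B$.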


One is tempted to think that $\partial (A \cap B) \subset \partial A \cap \partial B$. But this is not true, as
shown in the following example.


Example 6.27 


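Let $A = [0, 2] \times [0, 2]$ and $B = [1, 3] \times [1, 3]$. We find that $A \cap B = [1, 2] \times [1, 2]$. As shown in Figure 6.17, $\partial A \cap \partial B$ consists of only the two points $(2, 1)$ and $(1, 2)$, so $\partial (A \cap B) \not\subset \partial A \cap \partial B$, but $\partial (A \cap B) \subset \partial A \cup \partial B$.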


Figure 6.17: $\partial (A \cap B) \not\subset \partial A \cap \partial B$, but $\partial (A \cap B) \subset \partial A \cup \partial B$.
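
The two inclusions in Example 6.27 can also be checked on a grid of sample points. The sketch below is an illustration only: the helper `on_boundary` and the grid of half-integer points are hypothetical choices made here, and a grid check of course tests only the sampled points.

```python
def on_boundary(rect, p):
    """True if p lies on the boundary of the closed rectangle
    rect = ((a1, b1), (a2, b2))."""
    inside = all(a <= x <= b for (a, b), x in zip(rect, p))
    on_face = any(x == a or x == b for (a, b), x in zip(rect, p))
    return inside and on_face

A = ((0, 2), (0, 2))
B = ((1, 3), (1, 3))
AB = ((1, 2), (1, 2))   # the intersection of A and B

# Sample points k/2 for k = 0..6; halves are exact in binary floating point.
grid = [(i / 2, j / 2) for i in range(7) for j in range(7)]

both = [p for p in grid if on_boundary(A, p) and on_boundary(B, p)]
bd_AB = [p for p in grid if on_boundary(AB, p)]

print(both)                                                        # common boundary points
print(all(on_boundary(A, p) or on_boundary(B, p) for p in bd_AB))  # union contains bd(A ∩ B)
print(all(p in both for p in bd_AB))                               # intersection does not
```

On this grid the only points lying on both boundaries are $(1, 2)$ and $(2, 1)$, while every sampled boundary point of $A \cap B$ lies on the boundary of A or of B.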


Using Lemma 6.36, we obtain the following. 




Theorem 6.37 


If $D_1, D_2, \ldots, D_m$ are Jordan measurable sets, then the set $D_1 \cap D_2 \cap \cdots \cap D_m$ and the set $D_1 \cup D_2 \cup \cdots \cup D_m$ are also Jordan measurable.


It suffices to prove the case where m = 2. The general case follows by induction.

If $D_1$ and $D_2$ are Jordan measurable, Theorem 6.35 says that $\partial D_1$ and $\partial D_2$ have Jordan content zero. Theorem 6.32 says that $\partial D_1 \cup \partial D_2$ has Jordan content zero. Lemma 6.36 and Theorem 6.31 imply that $\partial (D_1 \cap D_2)$ and $\partial (D_1 \cup D_2)$ have Jordan content zero. Theorem 6.35 again implies that $D_1 \cap D_2$ and $D_1 \cup D_2$ are Jordan measurable sets.


Observe that the concepts of Jordan measurable sets and Riemann integrable functions are closely related. In a nutshell, a set D is a Jordan measurable set if and only if all the constant functions $f : D \to \mathbb{R}$ are Riemann integrable. In fact, this holds if and only if all the bounded continuous functions $f : D \to \mathbb{R}$ are Riemann integrable. We can even allow discontinuities on a set that has Jordan content zero. We will prove this after a few preparatory remarks and lemmas.


Remark 6.3 


If D is a bounded subset of $\mathbb{R}^n$ and D is contained in the closed rectangle I, then $\overline{D}$ is also contained in I. This implies that $\partial D$ is also contained in I.


The following example depicts the relation between the set of discontinuities of a function $f : D \to \mathbb{R}$ and that of its zero extension.


Example 6.28 


Let $D = \{ (x, y) \mid 0 \le y \le x \le 2 \}$, and let $f : D \to \mathbb{R}$ be the function
defined as


if (x,y) € D andO0 < az <1, 
if (%, 4) € © and 1 =z < 2. 




The set of discontinuities of the function f :D — | 


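The set D is contained in the rectangle $I = [0, 2] \times [0, 2]$. Let $\check{f} : I \to \mathbb{R}$ be the zero extension of $f : D \to \mathbb{R}$. Then the set of discontinuities of $\check{f} : I \to \mathbb{R}$ is

$$N_{\check{f}} = \{ (1, y) \mid 0 \le y \le 1 \} \cup \{ (x, x) \mid 0 \le x \le 2 \},$$

which is a subset of $N_f \cup \partial D$.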


Figure 6.18: The set of discontinuities of the functions discussed in Example 6.28. 


The following lemma gives the general case. 


Lemma 6.38 
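Let D be a bounded subset of $\mathbb{R}^n$. Given that $f : D \to \mathbb{R}$ is a bounded function defined on D, let $\check{f} : \mathbb{R}^n \to \mathbb{R}$ be its zero extension. Let

$$N_f = \{ \mathbf{x}_0 \in D \mid f : D \to \mathbb{R} \text{ is not continuous at } \mathbf{x}_0 \},$$

$$N_{\check{f}} = \{ \mathbf{x}_0 \in \mathbb{R}^n \mid \check{f} : \mathbb{R}^n \to \mathbb{R} \text{ is not continuous at } \mathbf{x}_0 \}.$$

Then

$$N_{\check{f}} \subset \partial D \cup N_f.$$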




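We will show that $\check{f} : \mathbb{R}^n \to \mathbb{R}$ is continuous at $\mathbf{x}_0$ if $\mathbf{x}_0$ is not in $\partial D \cup N_f$.
If $\mathbf{x}_0 \notin \partial D \cup N_f$, there are two possibilities.

• $\mathbf{x}_0$ is in $\text{ext}\, D$.

• $\mathbf{x}_0$ is in $\text{int}\, D$ and $f : D \to \mathbb{R}$ is continuous at $\mathbf{x}_0$.

If $\mathbf{x}_0$ is in $\text{ext}\, D$, there is an $r > 0$ such that $B(\mathbf{x}_0, r) \subset \mathbb{R}^n \setminus D$. Since $\check{f}(\mathbf{x}) = 0$ for all $\mathbf{x} \in B(\mathbf{x}_0, r)$, $\check{f} : \mathbb{R}^n \to \mathbb{R}$ is continuous at $\mathbf{x}_0$.

If $\mathbf{x}_0$ is in $\text{int}\, D$ and $f : D \to \mathbb{R}$ is continuous at $\mathbf{x}_0$, we want to show that $\check{f} : \mathbb{R}^n \to \mathbb{R}$ is continuous at $\mathbf{x}_0$. Given that $\{\mathbf{x}_k\}$ is a sequence in $\mathbb{R}^n$ that converges to $\mathbf{x}_0$, there is a positive integer K such that $\mathbf{x}_k \in \text{int}\, D$ for all $k \ge K$. This implies that $\check{f}(\mathbf{x}_k) = f(\mathbf{x}_k)$ for all $k \ge K$. Since $f : D \to \mathbb{R}$ is continuous at $\mathbf{x}_0$,

$$\lim_{k \to \infty} f(\mathbf{x}_{K+k}) = f(\mathbf{x}_0).$$

This implies that

$$\lim_{k \to \infty} \check{f}(\mathbf{x}_k) = \lim_{k \to \infty} \check{f}(\mathbf{x}_{K+k}) = \lim_{k \to \infty} f(\mathbf{x}_{K+k}) = f(\mathbf{x}_0) = \check{f}(\mathbf{x}_0).$$

Hence, $\check{f} : \mathbb{R}^n \to \mathbb{R}$ is continuous at $\mathbf{x}_0$. This completes the proof.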


Now we can prove the main theorem. 


Theorem 6.39 


Let D be a bounded subset of $\mathbb{R}^n$. Given that $f : D \to \mathbb{R}$ is a bounded function defined on D, let $N_f$ be the set of discontinuities of $f : D \to \mathbb{R}$. If D is a Jordan measurable set, and $N_f$ is a set that has Jordan content zero, then $f : D \to \mathbb{R}$ is Riemann integrable.


Let I be a closed rectangle in $\mathbb{R}^n$ that contains D, and let $\check{f} : I \to \mathbb{R}$ be the zero extension of $f : D \to \mathbb{R}$ to I. We want to show that $\check{f} : I \to \mathbb{R}$ is Riemann integrable.

By Lemma 6.38, the set of discontinuities $N_{\check{f}}$ of $\check{f} : I \to \mathbb{R}$ is contained in $\partial D \cup N_f$. Since D is Jordan measurable, Theorem 6.35 says that $\partial D$ has Jordan content zero. Theorem 6.32 and Theorem 6.31 then imply that the set $N_{\check{f}}$ has Jordan content zero. By Theorem 6.34, $\check{f} : I \to \mathbb{R}$ is Riemann integrable. This completes the proof.


In particular, we have the following. 


Corollary 6.40 


Let D be a subset of $\mathbb{R}^n$ that is Jordan measurable, and let $f : D \to \mathbb{R}$ be a bounded function defined on D. If $f : D \to \mathbb{R}$ is continuous, then it is Riemann integrable.


Let us emphasize the result in Corollary 6.40. 


Riemann Integrability of Continuous Functions 
Any bounded continuous function defined on a Jordan measurable set is Riemann
integrable.


Another interesting corollary of Theorem 6.39 is the following. 


Corollary 6.41 


Let D be a subset of $\mathbb{R}^n$ that has Jordan content zero. If $f : D \to \mathbb{R}$ is a bounded function defined on D, then $f : D \to \mathbb{R}$ is Riemann integrable and

$$\int_D f = 0.$$

Since D has Jordan content zero, Theorem 6.30 says that D is Jordan measurable with $\text{vol}(D) = 0$. The set $N_f$ of discontinuities of $f : D \to \mathbb{R}$ is a subset of D, so $N_f$ also has Jordan content zero. Theorem 6.39 implies that $f : D \to \mathbb{R}$ is Riemann integrable. Since $f : D \to \mathbb{R}$ is bounded, there is a positive number M such that $-M \le f(\mathbf{x}) \le M$ for all $\mathbf{x} \in D$. By the monotonicity theorem,

$$\int_D -M \, d\mathbf{x} \le \int_D f(\mathbf{x}) \, d\mathbf{x} \le \int_D M \, d\mathbf{x}.$$

For any constant c, linearity implies that $\int_D c \, d\mathbf{x} = c \,\text{vol}(D) = 0$. This proves that

$$\int_D f = 0.$$


Let us highlight this important result.


Bounded Functions Defined on Sets that have Jordan Content Zero
Any bounded function defined on a set that has Jordan content zero is 
Riemann integrable with integral zero. 


Corollary 6.41 also gives the following. 


Lemma 6.42 


Let D be a bounded subset of $\mathbb{R}^n$, and let $f : D \to \mathbb{R}$ be a bounded function defined on D. If there is a subset A of D with Jordan content zero such that $f(\mathbf{x}) = 0$ for all $\mathbf{x} \in D \setminus A$, then $f : D \to \mathbb{R}$ is Riemann integrable and

$$\int_D f = 0.$$

Let I be a closed rectangle in $\mathbb{R}^n$ that contains D, and let $\check{f} : I \to \mathbb{R}$ be the zero extension of the function $f : D \to \mathbb{R}$. Notice that it is also the zero extension of the function $f|_A : A \to \mathbb{R}$. By Corollary 6.41, $f|_A : A \to \mathbb{R}$ is Riemann integrable and $\int_A f|_A = 0$. Hence, $\check{f} : I \to \mathbb{R}$ is Riemann integrable, and so is $f : D \to \mathbb{R}$. Moreover,

$$\int_D f = \int_I \check{f} = \int_A f|_A = 0.$$


Using Lemma 6.42, we obtain the following important result, which says that Riemann integrability is not affected by the definition of the function on a set that has Jordan content zero.


Theorem 6.43 


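Let D be a bounded subset of $\mathbb{R}^n$, and let $f : D \to \mathbb{R}$ and $g : D \to \mathbb{R}$ be bounded functions defined on D. If $f : D \to \mathbb{R}$ is Riemann integrable and there is a subset A of D which has Jordan content zero such that

$$f(\mathbf{x}) = g(\mathbf{x}) \quad \text{for all } \mathbf{x} \in D \setminus A,$$

then $g : D \to \mathbb{R}$ is Riemann integrable and

$$\int_D g = \int_D f.$$


Sketch of Proof
Let $h : D \to \mathbb{R}$ be the function $h(\mathbf{x}) = f(\mathbf{x}) - g(\mathbf{x})$. Then $h(\mathbf{x}) = 0$ for all $\mathbf{x} \in D \setminus A$. By Lemma 6.42, $h : D \to \mathbb{R}$ is Riemann integrable and $\int_D h = 0$. The assertion follows from linearity.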


Using Theorem 6.43, we can generalize additivity to arbitrary sets. 


Theorem 6.44 Additivity 


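Given that $D_1$ and $D_2$ are bounded subsets of $\mathbb{R}^n$ such that $D_1 \cap D_2$ is a set that has Jordan content zero, let $D = D_1 \cup D_2$. Assume that $f : D \to \mathbb{R}$ is a bounded function defined on D. If the functions $f : D_1 \to \mathbb{R}$ and $f : D_2 \to \mathbb{R}$ are Riemann integrable, then the function $f : D \to \mathbb{R}$ is Riemann integrable and

$$\int_D f = \int_{D_1} f + \int_{D_2} f.$$


Let I be a closed rectangle in $\mathbb{R}^n$ that contains D, and let $D_0 = D_1 \cap D_2$. We are given that $D_0$ has Jordan content zero. Let $\check{f} : I \to \mathbb{R}$, $\check{f}_1 : I \to \mathbb{R}$, $\check{f}_2 : I \to \mathbb{R}$ and $\check{f}_0 : I \to \mathbb{R}$ be respectively the zero extensions of $f : D \to \mathbb{R}$, $f : D_1 \to \mathbb{R}$, $f : D_2 \to \mathbb{R}$ and $f : D_0 \to \mathbb{R}$. It is easy to see that

$$\check{f}(\mathbf{x}) = \check{f}_1(\mathbf{x}) + \check{f}_2(\mathbf{x}) - \check{f}_0(\mathbf{x}) \quad \text{for all } \mathbf{x} \in I. \tag{6.3}$$

By Corollary 6.41, $f : D_0 \to \mathbb{R}$ is Riemann integrable and $\int_{D_0} f = 0$. Hence, $\check{f}_0 : I \to \mathbb{R}$ is Riemann integrable.

Since $f : D_1 \to \mathbb{R}$ and $f : D_2 \to \mathbb{R}$ are Riemann integrable, $\check{f}_1 : I \to \mathbb{R}$ and $\check{f}_2 : I \to \mathbb{R}$ are Riemann integrable. Linearity and (6.3) imply that $\check{f} : I \to \mathbb{R}$ is Riemann integrable and

$$\int_D f = \int_I \check{f} = \int_I \check{f}_1 + \int_I \check{f}_2 - \int_I \check{f}_0 = \int_{D_1} f + \int_{D_2} f.$$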
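
The pointwise identity (6.3) for zero extensions is an inclusion-exclusion statement, and it can be checked on sample points. The sketch below is an illustration only: the helper `zero_extension`, the test function `f` and the two adjacent squares are hypothetical choices made here.

```python
def zero_extension(f, in_domain):
    """Return the function that equals f on the domain and 0 elsewhere."""
    return lambda x, y: f(x, y) if in_domain(x, y) else 0

f = lambda x, y: x + 2 * y + 1          # hypothetical test function

in_D1 = lambda x, y: 0 <= x <= 1 and 0 <= y <= 1
in_D2 = lambda x, y: 1 <= x <= 2 and 0 <= y <= 1
in_D0 = lambda x, y: in_D1(x, y) and in_D2(x, y)   # the segment x = 1
in_D = lambda x, y: in_D1(x, y) or in_D2(x, y)

F, F1, F2, F0 = (zero_extension(f, d) for d in (in_D, in_D1, in_D2, in_D0))

# Sample points k/4 in the rectangle I = [-1, 3] x [-1, 2]; quarters are
# exact in binary floating point, so == comparisons are safe here.
points = [(i / 4, j / 4) for i in range(-4, 13) for j in range(-4, 9)]
identity_holds = all(F(x, y) == F1(x, y) + F2(x, y) - F0(x, y) for x, y in points)
print(identity_holds)
```

On the common edge $x = 1$ the values of $\check{f}_1$ and $\check{f}_2$ are both counted, which is exactly what the term $-\check{f}_0$ corrects.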


By induction and Theorem 6.32, we obtain the following. 


Theorem 6.45 


Given that $D_1, D_2, \ldots, D_m$ are bounded subsets of $\mathbb{R}^n$ such that for any pair $(i, j)$ with $i \ne j$, $D_i \cap D_j$ has Jordan content zero, let $D = D_1 \cup D_2 \cup \cdots \cup D_m$. Assume that $f : D \to \mathbb{R}$ is a bounded function defined on D. If the functions $f : D_j \to \mathbb{R}$, $1 \le j \le m$, are Riemann integrable, then the function $f : D \to \mathbb{R}$ is Riemann integrable and

$$\int_D f = \int_{D_1} f + \int_{D_2} f + \cdots + \int_{D_m} f.$$


Remark 6.4



If a function $f : D \to \mathbb{R}$ is Riemann integrable and $D_1$ is a subset of D, $f : D_1 \to \mathbb{R}$ is not necessarily Riemann integrable. For example, consider the constant function f on $D = [0, 1]^n$ which takes value 1. Its restriction to $D_1 = D \cap \mathbb{Q}^n$ is not Riemann integrable.

Thus for the general additivity theorem, we do not have if and only if.




Using the fact that a set D is Jordan measurable if and only if the constant function $\chi_D : D \to \mathbb{R}$ is Riemann integrable, Theorem 6.45 gives the following.


Corollary 6.46 


Let $D_1, D_2, \ldots, D_m$ be Jordan measurable subsets of $\mathbb{R}^n$ such that for any pair $(i, j)$ with $i \ne j$, $D_i \cap D_j$ has Jordan content zero. Then the set $D = D_1 \cup D_2 \cup \cdots \cup D_m$ is also Jordan measurable. Moreover,

$$\text{vol}(D) = \text{vol}(D_1) + \text{vol}(D_2) + \cdots + \text{vol}(D_m).$$


Let us return to Jordan measurable sets. So far we only know
explicitly that a closed rectangle is Jordan measurable, and some examples of sets
that have Jordan content zero. Since a bounded subset D is Jordan measurable
if and only if its boundary has Jordan content zero, we will first explore sets that
have Jordan content zero. The following theorem will give us a lot of examples
of sets that have Jordan content zero.


Theorem 6.47 


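Let D be a Jordan measurable subset of $\mathbb{R}^n$, and let $f : D \to \mathbb{R}$ be a Riemann integrable function. Then the graph of f defined by

$$G_f = \{ (\mathbf{x}, y) \in \mathbb{R}^{n+1} \mid \mathbf{x} \in D,\ y = f(\mathbf{x}) \}$$

is a subset of $\mathbb{R}^{n+1}$ that has Jordan content zero.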


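Let I be a closed rectangle in $\mathbb{R}^n$ that contains D, and let $\check{f} : I \to \mathbb{R}$ be the zero extension of $f : D \to \mathbb{R}$. Fix $\varepsilon > 0$. Since $f : D \to \mathbb{R}$ is Riemann integrable, there is a partition P of I such that

$$U(\check{f}, P) - L(\check{f}, P) < \frac{\varepsilon}{2}.$$

Let

$$\eta = \frac{\varepsilon}{4 \,\text{vol}(I)}.$$

Then $\eta > 0$. Let

$$\mathcal{K} = \{ J \times [m_J - \eta, M_J + \eta] \mid J \in \mathcal{J}_P \}.$$

Then $\mathcal{K}$ is a finite collection of closed rectangles in $\mathbb{R}^{n+1}$. If $(\mathbf{x}, f(\mathbf{x}))$ is in $G_f$, there is a $J \in \mathcal{J}_P$ such that $\mathbf{x} \in J$. Then $m_J \le f(\mathbf{x}) \le M_J$ implies that $(\mathbf{x}, f(\mathbf{x}))$ is in $J \times [m_J - \eta, M_J + \eta]$. This proves that

$$G_f \subset \bigcup_{K \in \mathcal{K}} K.$$

Now,

$$\sum_{K \in \mathcal{K}} \text{vol}(K) = \sum_{J \in \mathcal{J}_P} (M_J - m_J + 2\eta) \,\text{vol}(J) = U(\check{f}, P) - L(\check{f}, P) + 2\eta \sum_{J \in \mathcal{J}_P} \text{vol}(J) < \frac{\varepsilon}{2} + 2\eta \,\text{vol}(I) = \varepsilon.$$

This proves that $G_f$ has Jordan content zero.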
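
The covering used in this proof can be computed explicitly for a simple function. The sketch below is an illustration only: it assumes $n = 1$, $D = I = [0, 1]$, an increasing function (so that $m_J$ and $M_J$ are the values at the end points of J), and a choice of $\eta$ made here; the names `cover_of_graph`, `total_area` and `square` are hypothetical.

```python
from fractions import Fraction

def cover_of_graph(f, m, eta):
    """Rectangles J x [m_J - eta, M_J + eta] over the uniform partition of
    [0, 1] into m intervals, for an increasing function f (so that the
    infimum and supremum on J are attained at its end points)."""
    cover = []
    for k in range(m):
        a, b = Fraction(k, m), Fraction(k + 1, m)
        cover.append(((a, b), (f(a) - eta, f(b) + eta)))
    return cover

def total_area(cover):
    """Sum of the areas of the rectangles in the cover."""
    return sum((b - a) * (hi - lo) for (a, b), (lo, hi) in cover)

square = lambda x: x * x   # increasing on [0, 1]

for m in (10, 100, 1000):
    eta = Fraction(1, m)   # assumption: eta shrinks together with the mesh
    print(m, total_area(cover_of_graph(square, m, eta)))
```

For this function the total area is $U - L + 2\eta = 1/m + 2\eta$, so with $\eta = 1/m$ it equals $3/m$ and can be made smaller than any prescribed $\varepsilon$.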


Specializing to continuous functions, we have the following.


Corollary 6.48 


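Let D be a Jordan measurable set in $\mathbb{R}^n$, and let $f : D \to \mathbb{R}$ be a bounded continuous function. Then the graph of f defined by

$$G_f = \{ (\mathbf{x}, y) \in \mathbb{R}^{n+1} \mid \mathbf{x} \in D,\ y = f(\mathbf{x}) \}$$

is a subset of $\mathbb{R}^{n+1}$ that has Jordan content zero.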


Example 6.29 


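Any line segment L between two points $(x_1, y_1)$ and $(x_2, y_2)$ in $\mathbb{R}^2$ has Jordan content zero.
If $x_1 = x_2$, the line segment L is vertical. Assuming $y_1 \le y_2$, it is a subset of the boundary of the closed rectangle $[x_1, x_1 + 1] \times [y_1, y_2]$. Hence, L has Jordan content zero.

If $x_1 \ne x_2$, we can assume that $x_1 < x_2$. Then L is the graph of the continuous function $f : [x_1, x_2] \to \mathbb{R}$,

$$f(x) = y_1 + \frac{y_2 - y_1}{x_2 - x_1} (x - x_1).$$

By Corollary 6.48, L has Jordan content zero.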


Figure 6.19: Line segments in the plane have Jordan content zero. 


Example 6.30 


Since the boundary of a polygon in R? is a finite union of line segments, a 
polygon is Jordan measurable. The interior of the polygon is also Jordan 
measurable as it has the same boundary. 


Figure 6.20: Polygons in the plane are Jordan measurable. 




Example 6.31 


We can argue that a line segment or the part of a plane in $\mathbb{R}^3$ contained inside a bounded set has Jordan content zero. A plane in $\mathbb{R}^3$ satisfies an equation of the form

$$ax + by + cz = d,$$

where $(a, b, c) \ne \mathbf{0}$. Therefore, one can always solve for one of the variables as a function of the other two. For example, if $c \ne 0$, then

$$z = f(x, y) = \frac{d - ax - by}{c}.$$

Hence, a plane is the graph of a continuous function. If we consider the part of the plane contained within a bounded set, then it must have Jordan content zero. For example,

$$S = \{ (x, y, z) \mid x + y + z = 3,\ x \ge 0,\ y \ge 0,\ z \ge 0 \}$$

is the part of the plane $x + y + z = 3$ bounded inside the rectangle $[0, 3] \times [0, 3] \times [0, 3]$. Hence, S has Jordan content zero.

A line segment in $\mathbb{R}^3$ can always be regarded as a subset of a part of a plane that is contained in a bounded set. Hence, it also has Jordan content zero.


Example 6.32 


The boundary of the open rectangle $U = \prod_{i=1}^{n} (a_i, b_i)$ is the same as the boundary of its closure $R = \prod_{i=1}^{n} [a_i, b_i]$. Hence, U is also a Jordan measurable set. Since U and $\partial U$ are disjoint, and their union is R,

$$\text{vol}\, U + \text{vol}\, \partial U = \text{vol}\, R.$$

Since $\text{vol}(\partial U) = 0$, we have

$$\text{vol}\, U = \text{vol}\, R.$$

Motivated by Example 6.32, an interesting question to ask is: if the subset D of $\mathbb{R}^n$ is Jordan measurable, is its closure $\overline{D}$ Jordan measurable? This is answered in the following theorem.


Theorem 6.49 


If D is a subset of $\mathbb{R}^n$ that is Jordan measurable, so is $\overline{D}$. Moreover,

$$\text{vol}\, \overline{D} = \text{vol}\, D.$$


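First we claim that $\partial \overline{D} \subset \partial D$. As the closure of D, $\overline{D}$ is a disjoint union of $\text{int}\, D$ and $\partial D$. As the closure of $\overline{D}$, $\overline{D}$ is a disjoint union of $\text{int}\, \overline{D}$ and $\partial \overline{D}$. Since $D \subset \overline{D}$, we have $\text{int}\, D \subset \text{int}\, \overline{D}$. Hence, we must have $\partial \overline{D} \subset \partial D$.
If D is Jordan measurable, $\partial D$ has Jordan content zero. Since $\partial \overline{D} \subset \partial D$, $\partial \overline{D}$ also has Jordan content zero. Hence, $\overline{D}$ is Jordan measurable.
For the last statement, we use the fact that $\overline{D} = D \cup \partial D$. Notice that $D \cap \partial D \subset \partial D$. Hence, $D \cap \partial D$ has Jordan content zero. By the additivity theorem,

$$\text{vol}\, D + \text{vol}\, \partial D = \text{vol}\, \overline{D}.$$

Since $\text{vol}\, \partial D = 0$, we conclude that $\text{vol}\, \overline{D} = \text{vol}\, D$.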


Example 6.33 


Consider the set $A = (-1, 0) \cup (0, 1)$. Its closure is $\overline{A} = [-1, 1]$. Hence, $\partial \overline{A} = \{-1, 1\}$ is not equal to $\partial A = \{-1, 0, 1\}$. This also shows that even for an open set A, we do not have $\partial \overline{A} = \partial A$.


Remark 6.5 


If D is a bounded subset of $\mathbb{R}^n$ such that $\overline{D}$ is Jordan measurable, one cannot deduce that D is Jordan measurable. An example is given by $D = [0, 1]^n \cap \mathbb{Q}^n$, which is not Jordan measurable, but $\overline{D} = [0, 1]^n$ is Jordan measurable.




Example 6.34 


Let r be a positive number. We claim that the disc

$$D = \{ (x, y) \mid x^2 + y^2 < r^2 \}$$

and its closure are Jordan measurable sets. By Theorem 6.49, it is sufficient to show that D is Jordan measurable. Notice that

$$\partial D = \{ (x, y) \mid x^2 + y^2 = r^2 \} = S_+ \cup S_-,$$

where

$$S_\pm = \{ (x, y) \mid -r \le x \le r,\ y = \pm\sqrt{r^2 - x^2} \}.$$

$S_\pm$ are the graphs of the functions $f_\pm : [-r, r] \to \mathbb{R}$,

$$f_\pm(x) = \pm\sqrt{r^2 - x^2}.$$

Since $[-r, r]$ is a Jordan measurable set in $\mathbb{R}$, and $f_\pm : [-r, r] \to \mathbb{R}$ are continuous functions, Corollary 6.48 implies that $S_\pm = G_{f_\pm}$ have Jordan content zero. Therefore, D is Jordan measurable.


Figure 6.21: An open ball and its closure are Jordan measurable. 


More generally than the open balls, we have the following. 




Example 6.35 


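Let $[a, b]$ be a closed interval in $\mathbb{R}$, and let $f : [a, b] \to \mathbb{R}$ and $g : [a, b] \to \mathbb{R}$ be continuous functions satisfying $f(x) \le g(x)$ for all $x \in [a, b]$. Define $D_1$ and $D_2$ to be the sets

$$D_1 = \{ (x, y) \mid a < x < b,\ f(x) < y < g(x) \},$$
$$D_2 = \{ (x, y) \mid a \le x \le b,\ f(x) \le y \le g(x) \}.$$

Show that $D_1$ and $D_2$ are Jordan measurable sets.


Solution


Since $f : [a, b] \to \mathbb{R}$ and $g : [a, b] \to \mathbb{R}$ are continuous, $D_1$ is open and $D_2$ is closed. Since $[a, b]$ is compact, they are bounded. There is a positive number M such that

$$|f(x)| \le M \quad \text{and} \quad |g(x)| \le M \quad \text{for all } x \in [a, b].$$

Let

$$S_1 = \{ (a, y) \mid -M \le y \le M \}, \qquad S_2 = \{ (b, y) \mid -M \le y \le M \},$$
$$S_3 = \{ (x, y) \mid a \le x \le b,\ y = f(x) \}, \qquad S_4 = \{ (x, y) \mid a \le x \le b,\ y = g(x) \},$$

and let $S = S_1 \cup S_2 \cup S_3 \cup S_4$. Then

$$D_2 \setminus D_1 \subset S.$$

Since $D_1$ is open and $D_1 \subset D_2$,

$$D_1 = \text{int}\, D_1 \subset \text{int}\, D_2.$$

Since $D_2$ is closed and $D_1 \subset D_2$,

$$\overline{D_1} \subset \overline{D_2} = D_2.$$

Therefore,

$$\partial D_1 = \overline{D_1} \setminus \text{int}\, D_1 \subset D_2 \setminus D_1 \subset S,$$
$$\partial D_2 = \overline{D_2} \setminus \text{int}\, D_2 \subset D_2 \setminus D_1 \subset S.$$

Since $S_1$ and $S_2$ are line segments, they have Jordan content zero. Since $S_3$ and $S_4$ are graphs of continuous functions defined on the Jordan measurable set $[a, b]$, $S_3$ and $S_4$ also have Jordan content zero. These imply that S has Jordan content zero. Thus, $\partial D_1$ and $\partial D_2$ also have Jordan content zero, which implies that $D_1$ and $D_2$ are Jordan measurable sets.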


Figure 6.22: The set $D = \{ (x, y) \mid a \le x \le b,\ f(x) \le y \le g(x) \}$ is Jordan measurable.

More generally than Example 6.35, we can prove the following. 


Theorem 6.50 


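Let U be a Jordan measurable set in $\mathbb{R}^n$, and let $f : U \to \mathbb{R}$ and $g : U \to \mathbb{R}$ be bounded continuous functions on U satisfying $f(\mathbf{x}) \le g(\mathbf{x})$ for all $\mathbf{x} \in U$. Then the subsets

$$D_1 = \{ (\mathbf{x}, y) \mid \mathbf{x} \in U,\ f(\mathbf{x}) < y < g(\mathbf{x}) \},$$
$$D_2 = \{ (\mathbf{x}, y) \mid \mathbf{x} \in U,\ f(\mathbf{x}) \le y \le g(\mathbf{x}) \}$$

of $\mathbb{R}^{n+1}$ are Jordan measurable.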




Sketch of Proof 
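Let M be a positive number such that

$$|f(\mathbf{x})| \le M \quad \text{and} \quad |g(\mathbf{x})| \le M \quad \text{for all } \mathbf{x} \in U.$$

The sets $\partial D_1$ and $\partial D_2$ are contained in the set $S = S_1 \cup S_2 \cup S_3$, where

$$S_1 = \{ (\mathbf{x}, y) \mid \mathbf{x} \in \partial U,\ -M \le y \le M \},$$
$$S_2 = \{ (\mathbf{x}, y) \mid \mathbf{x} \in U,\ y = f(\mathbf{x}) \},$$
$$S_3 = \{ (\mathbf{x}, y) \mid \mathbf{x} \in U,\ y = g(\mathbf{x}) \}.$$

The sets $S_1$, $S_2$ and $S_3$ have Jordan content zero.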


Example 6.36 


We claim that an open ball $B(\mathbf{x}_0, r)$ in $\mathbb{R}^n$ and its closure are Jordan measurable sets. It is sufficient to consider the case where $\mathbf{x}_0 = \mathbf{0}$ and $r = 1$. Let $B^n = B(\mathbf{0}, 1)$. We will show that $B^n$ is Jordan measurable by induction on n. Then $\overline{B^n}$ is also Jordan measurable.

When $n = 1$, $B^1 = (-1, 1)$ is an interval whose boundary is the two-point set $\{-1, 1\}$, which has Jordan content zero. For $n \ge 1$, assume that $B^n$ is a Jordan measurable subset of $\mathbb{R}^n$. Notice that

$$B^{n+1} = \{ (x_1, \ldots, x_n, x_{n+1}) \mid (x_1, \ldots, x_n) \in B^n,\ f_-(x_1, \ldots, x_n) < x_{n+1} < f_+(x_1, \ldots, x_n) \},$$

where $f_\pm : B^n \to \mathbb{R}$ are the bounded continuous functions defined by

$$f_\pm(x_1, \ldots, x_n) = \pm\sqrt{1 - x_1^2 - \cdots - x_n^2}.$$

By the inductive hypothesis, $B^n$ is Jordan measurable. By Theorem 6.50, $B^{n+1}$ is also Jordan measurable.


Now we give some examples of Riemann integrable functions. 




Example 6.37 


Let $D = \{ (x, y, z) \mid x^2 + y^2 < 4,\ -3 < z < 3 \}$, and let $f : D \to \mathbb{R}$ be the function defined as

$$f(x, y, z) = x^2 + 4y^2 + 9z^2.$$

Explain why $f : D \to \mathbb{R}$ is Riemann integrable.


Solution


The set $U = \{ (x, y) \mid x^2 + y^2 < 4 \}$ is an open ball. Hence, it is Jordan measurable. The functions $g_\pm : U \to \mathbb{R}$, $g_\pm(x, y) = \pm 3$ are continuous functions. Theorem 6.50 implies that D is Jordan measurable. The function $f(x, y, z) = x^2 + 4y^2 + 9z^2$ is a polynomial. Hence, it is continuous on the compact set $\overline{D}$, and hence, it is bounded and continuous on D. Therefore, $f : D \to \mathbb{R}$ is Riemann integrable.


Figure 6.23: The Jordan measurable set in Example 6.37. 


Example 6.38 


Let D = {(z,y)|2? + 


PO 
(Papi ine 2h 


Explain why f :D — | 


y” < 1}, and let f :D — R be the function 


2 if ¢ =< y, 


IR is Riemann integrable. 




Solution 
The set D is an open ball. Hence, it is Jordan measurable. The set of discontinuities of the function $f : D \to \mathbb{R}$ is contained in the line segment L from the point $(-1, -1)$ to the point $(1, 1)$. Since L has Jordan content zero, $f : D \to \mathbb{R}$ is Riemann integrable.


Figure 6.24: The Jordan measurable set in Example 6.38. 


At the end of this section, let us prove a few interesting theorems. The next 
theorem shows that a set with Jordan content zero cannot have nonempty interior. 


Theorem 6.51 


Let D be a subset of $\mathbb{R}^n$ that has Jordan content zero. Then $\text{int}\, D = \emptyset$.


We use proof by contradiction. If $\text{int}\, D \ne \emptyset$, it is an open set that contains at least one point $\mathbf{u} = (u_1, \ldots, u_n)$. By definition of interior points, there exists $r > 0$ such that $B(\mathbf{u}, r) \subset D$. There exists $\delta > 0$ such that the rectangle $I_\delta = \prod_{i=1}^{n} [u_i - \delta, u_i + \delta]$ is contained in $B(\mathbf{u}, r)$, and hence in D. Since D has Jordan content zero, we find that $I_\delta$ also has Jordan content zero. But

$$\text{vol}(I_\delta) = (2\delta)^n > 0.$$

This gives a contradiction.




The next one is the mean value theorem for integrals. 


Theorem 6.52 Mean Value Theorem for Integrals 


Let D be a closed and bounded Jordan measurable set in $\mathbb{R}^n$, and let $f : D \to \mathbb{R}$ be a continuous function. If D is connected or path-connected, then there is a point $\mathbf{x}_0$ in D such that

$$\int_D f(\mathbf{x}) \, d\mathbf{x} = f(\mathbf{x}_0) \,\text{vol}(D).$$


Since D is compact and f : D → R is continuous, the extreme value theorem
asserts that there exist points u and v in D such that

\[ f(\mathbf{u}) \le f(\mathbf{x}) \le f(\mathbf{v}) \quad \text{for all } \mathbf{x} \in D. \]

Since D is a Jordan measurable set and f : D → R is continuous,
f : D → R is Riemann integrable. The monotonicity theorem implies that

\[ f(\mathbf{u})\operatorname{vol}(D) \le \int_{D} f(\mathbf{x})\,d\mathbf{x} \le f(\mathbf{v})\operatorname{vol}(D). \]

If vol (D) = 0, we can take x_0 to be any point in D. If vol (D) ≠ 0, notice
that

\[ c = \frac{1}{\operatorname{vol}(D)} \int_{D} f(\mathbf{x})\,d\mathbf{x} \]

satisfies

\[ f(\mathbf{u}) \le c \le f(\mathbf{v}). \]

Since D is connected or path-connected, and f : D → R is continuous, the
intermediate value theorem asserts that f(D) must be an interval. Since
f(u) and f(v) are in f(D) and c is in between them, c must also be in
f(D). This means that there is an x_0 in D such that

\[ \frac{1}{\operatorname{vol}(D)} \int_{D} f(\mathbf{x})\,d\mathbf{x} = c = f(\mathbf{x}_0). \]
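
As an illustration of the mean value theorem for integrals, the sketch below takes an assumed example that is not in the text: f(x, y) = x² + y² on the unit square [0, 1] × [0, 1], which is compact, path-connected and has volume 1. It approximates the average value c with a midpoint Riemann sum and then locates a point (t, t) on the diagonal with f(t, t) = c by bisection. The helper name `midpoint_average` is hypothetical.

```python
def midpoint_average(f, n):
    """Midpoint Riemann sum of f over [0,1] x [0,1] with n x n cells.

    The square has volume 1, so the integral equals the average value.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += f((i + 0.5) * h, (j + 0.5) * h)
    return total * h * h


def f(x, y):
    return x * x + y * y


c = midpoint_average(f, 200)  # close to the exact average 2/3

# f(t, t) increases from 0 to 2 on [0, 1], so bisection finds t with f(t, t) = c.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if f(mid, mid) < c:
        lo = mid
    else:
        hi = mid
t = (lo + hi) / 2
print(c, t)
```

The bisection step is just the one-dimensional intermediate value theorem applied along a path in D, which mirrors the argument in the proof.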




Exercises 6.3 


Question 1 


Explain why the set 
D = {(x, y, z) | 4x² + y² + 9z² < 36}
is Jordan measurable. 


Question 2 


Explain why the set 


S = {(x, y, z) | x ≥ 0, y ≥ 0, z ≥ 0, 3x + 4y + 6z ≤ 12}


is Jordan measurable. 


Question 3 


Let D = {(x, y, z) | x² + y² + z² = 25}, and let f : D → R be the
function defined as 


if x, y and z are rational, 
f(x,y, 2) = 


otherwise. 


Explain why f : D → R is Riemann integrable and find the integral ∫_D f.


Question 4 


Let D = {(x, y, z) | x² + y² + z² < 25}, and let f : D → R be the
function defined as 
pe — zeleyl, 


Explain why f : D → R is Riemann integrable.




Question 5 


Let D = (0, 2] × (−2, 5) × (1, 7], and let f : D → R be the function defined
as 


zt+y, ie 
J (2,9; 2) — 
2u — y, Tae 


Explain why f : D → R is Riemann integrable.


Question 6 
Let D = {(x, y, z) | 4x² + 9y² ≤ 36, 0 ≤ z ≤ x² + y²}, and let f : D → R
be the function defined as

ae Wie Ea 

Yag, ia 2 yee z. 


cAGagie) = 


Explain why f : D → R is Riemann integrable.


Question 7 


If D is a Jordan measurable set that is contained in the closed rectangle I, 
show that I \ D is also Jordan measurable. Moreover, 


vol (I \ D) = vol (I) − vol (D).


Question 8 
If D_1 and D_2 are Jordan measurable sets and D_2 is contained in D_1, show
that D_1 \ D_2 is also Jordan measurable. Moreover,

vol (D_1 \ D_2) = vol (D_1) − vol (D_2).


Question 9 


If D is a Jordan measurable set, show that int D is also Jordan measurable. 


Moreover, 
vol (int D) = vol (D).




Question 10 


Let D_1 and D_2 be Jordan measurable sets in R^m and R^n respectively.
Assume that D_1 has Jordan content zero. Show that the set D = D_1 × D_2
in R^{m+n} also has Jordan content zero.


Question 11 


Let D = {(x, y) | x² + y² < 9}. Show that the integral ∫_D x dxdy exists and
is equal to 0.


Question 12 


Let D be a Jordan measurable set, and let f : D → R be a Riemann
integrable function. If g : D → R is a bounded function such that
g(x) = f(x) for all x in int D, show that g : D → R is Riemann integrable
and

\[ \int_{D} g = \int_{D} f. \]




6.4 Iterated Integrals and Fubini’s Theorem 


In Section 6.3, we have given a sufficient condition for a function f : D → R to
be Riemann integrable.


Riemann Integrable Functions 


If D is a subset of R^n such that a constant function on D is Riemann
integrable, then any bounded function on D whose set of discontinuities
is a set that has Jordan content zero is Riemann integrable.


However, we have not discussed any strategy to compute a Riemann integral,
except by using a sequence of partitions {P_k} with

\[ \lim_{k\to\infty} |\mathbf{P}_k| = 0. \]


This is a practical approach if one has a computer, but it is not feasible for hand 
calculations. Besides, it might also be difficult for us to understand the dependence 
of the integral on the parameters in the integrand. When n = 1, we have seen 
that the fundamental theorem of calculus gives us a powerful tool to calculate a 
Riemann integral when the integrand is a continuous function that has explicit 
antiderivatives. To be able to apply this powerful tool in the multivariable context, 
we need to relate multiple integrals with iterated integrals. This is the topic that is 
studied in this section. 


As a motivation, consider a continuous function f : [a, b] × [c, d] → R defined
on the closed rectangle I = [a, b] × [c, d] in R². If 𝐏 = (P_1, P_2) is a partition of I
with

\[ P_1 = \{x_0, x_1, \ldots, x_k\} \quad\text{and}\quad P_2 = \{y_0, y_1, \ldots, y_l\}, \]

there are kl rectangles in the partition 𝐏. Denote the rectangles by J_{i,j} with
1 ≤ i ≤ k and 1 ≤ j ≤ l, where

\[ \mathbf{J}_{i,j} = [x_{i-1}, x_i] \times [y_{j-1}, y_j]. \]

Choose a set of intermediate points A = {α_i} for the partition P_1, and a set of
intermediate points B = {β_j} for the partition P_2. Let

\[ \boldsymbol{\xi}_{i,j} = (\alpha_i, \beta_j) \quad\text{for } 1 \le i \le k,\ 1 \le j \le l. \]




Then C = {ξ_{i,j} | 1 ≤ i ≤ k, 1 ≤ j ≤ l} is a choice of intermediate points for the
partition 𝐏. The Riemann sum R(f, 𝐏, C) is given by

\[ R(f,\mathbf{P},C) = \sum_{i=1}^{k}\sum_{j=1}^{l} f(\alpha_i,\beta_j)(x_i - x_{i-1})(y_j - y_{j-1}). \tag{6.4} \]

Since it is a finite sum, it does not matter in which order we perform the summation.
For fixed x ∈ [a, b], let g_x : [c, d] → R be the function

\[ g_x(y) = f(x,y), \quad y \in [c,d]. \]

If we perform the sum over j in (6.4) first, we find that

\[ R(f,\mathbf{P},C) = \sum_{i=1}^{k}\left(\sum_{j=1}^{l} f(\alpha_i,\beta_j)(y_j - y_{j-1})\right)(x_i - x_{i-1}) = \sum_{i=1}^{k} R(g_{\alpha_i}, P_2, B)(x_i - x_{i-1}). \]

Since g_{α_i} : [c, d] → R is continuous, it is Riemann integrable. Therefore,

\[ \lim_{|P_2|\to 0} R(g_{\alpha_i}, P_2, B) = \int_c^d g_{\alpha_i}(y)\,dy. \]

This prompts us to define the function F : [a, b] → R by

\[ F(x) = \int_c^d g_x(y)\,dy = \int_c^d f(x,y)\,dy. \]

Then

\[ \lim_{|P_2|\to 0} R(f,\mathbf{P},C) = \lim_{|P_2|\to 0} \sum_{i=1}^{k} R(g_{\alpha_i}, P_2, B)(x_i - x_{i-1}) = \sum_{i=1}^{k} F(\alpha_i)(x_i - x_{i-1}) = R(F, P_1, A). \]

If F : [a, b] → R is also Riemann integrable, we would have

\[ \lim_{|P_1|\to 0}\lim_{|P_2|\to 0} R(f,\mathbf{P},C) = \lim_{|P_1|\to 0} R(F, P_1, A) = \int_a^b F(x)\,dx = \int_a^b\left(\int_c^d f(x,y)\,dy\right)dx. \]




Interchanging the roles of x and y, or equivalently, summing over i first in (6.4),
we find that

\[ \lim_{|P_2|\to 0}\lim_{|P_1|\to 0} R(f,\mathbf{P},C) = \int_c^d\left(\int_a^b f(x,y)\,dx\right)dy. \]
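
The observation that the finite double sum (6.4) can be evaluated in either order can be sketched numerically. The example below is an illustration under stated assumptions: the integrand f(x, y) = x y² on [0, 1] × [0, 2], uniform partitions, and midpoints as intermediate points are all choices made here, and the function names are hypothetical.

```python
def sum_j_first(f, xs, ys, alphas, betas):
    """Riemann sum (6.4), summing over j (the y-direction) first."""
    total = 0.0
    for i in range(1, len(xs)):
        inner = sum(f(alphas[i - 1], betas[j - 1]) * (ys[j] - ys[j - 1])
                    for j in range(1, len(ys)))
        total += inner * (xs[i] - xs[i - 1])
    return total


def sum_i_first(f, xs, ys, alphas, betas):
    """Riemann sum (6.4), summing over i (the x-direction) first."""
    total = 0.0
    for j in range(1, len(ys)):
        inner = sum(f(alphas[i - 1], betas[j - 1]) * (xs[i] - xs[i - 1])
                    for i in range(1, len(xs)))
        total += inner * (ys[j] - ys[j - 1])
    return total


def uniform(a, b, n):
    return [a + (b - a) * i / n for i in range(n + 1)]


def midpoints(points):
    return [(p + q) / 2 for p, q in zip(points, points[1:])]


def f(x, y):
    return x * y * y


xs, ys = uniform(0.0, 1.0, 100), uniform(0.0, 2.0, 100)
alphas, betas = midpoints(xs), midpoints(ys)
row_first = sum_j_first(f, xs, ys, alphas, betas)
col_first = sum_i_first(f, xs, ys, alphas, betas)
print(row_first, col_first)  # both approximate the exact value 4/3
```

The two sums agree up to floating-point rounding for any partition; the subtle point discussed in the text is what happens when the limits |P_1| → 0 and |P_2| → 0 are taken one after the other.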




Figure 6.25: Given a partition P of a rectangle, one can sum over the rectangles 
row by row, or column by column. 


Since

\[ |\mathbf{P}| = \sqrt{|P_1|^2 + |P_2|^2}, \]

|𝐏| → 0 if and only if (|P_1|, |P_2|) → (0, 0). The question becomes whether the
two limits

\[ \lim_{|P_1|\to 0}\lim_{|P_2|\to 0} R(f,\mathbf{P},C) \quad\text{and}\quad \lim_{|P_2|\to 0}\lim_{|P_1|\to 0} R(f,\mathbf{P},C) \]

are equal; and whether they are equal to the limit

\[ \lim_{(|P_1|,|P_2|)\to(0,0)} R(f,\mathbf{P},C). \]


Remark 6.6 


IR be the function defined as 


f(z,y) = (x,y) €] 


ge? + y?’ 


Chapter 6. Multiple Integrals 434 


We find that 


lien limi f(y) = Te =) 


y0 x0 


Lina lien f(a, 2) — lim sie 


«z—+0 y0 


Therefore,

\[ \lim_{y\to 0}\lim_{x\to 0} f(x,y) \ne \lim_{x\to 0}\lim_{y\to 0} f(x,y). \]

This example shows that we cannot freely interchange the order of limits.
In fact, the limit

\[ \lim_{(x,y)\to(0,0)} f(x,y) \]

does not exist.
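
The phenomenon in Remark 6.6 can be observed numerically. The sketch below uses an assumed stand-in function, f(x, y) = x²/(x² + y²) on R² \ {(0, 0)}, which may differ from the function in the remark; it is chosen only because its two iterated limits at the origin are 0 and 1.

```python
def f(x, y):
    # Assumed stand-in example, defined away from the origin.
    return x * x / (x * x + y * y)


# Let x -> 0 first (y fixed and nonzero): values are close to 0.
x_first = f(1e-9, 1e-3)

# Let y -> 0 first (x fixed and nonzero): values are close to 1.
y_first = f(1e-3, 1e-9)

# Along the diagonal y = x the value is constantly 1/2.
diagonal = f(1e-5, 1e-5)

print(x_first, y_first, diagonal)
```

Three different values along three approaches to the origin are consistent with the two-variable limit failing to exist for this stand-in function.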


The integrals

\[ \int_a^b\left(\int_c^d f(x,y)\,dy\right)dx \quad\text{and}\quad \int_c^d\left(\int_a^b f(x,y)\,dx\right)dy \]

are called iterated integrals.


Definition 6.17 Iterated Integrals 


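Let n be a positive integer larger than 1, and let k be a positive integer less
than n. Denote a point in R^n by (x, y), where x ∈ R^k and y ∈ R^{n−k}.
Given that I = ∏_{i=1}^{n} [a_i, b_i] is a closed rectangle in R^n, let

\[ \mathbf{I}_{\mathbf{x}} = \prod_{i=1}^{k} [a_i, b_i] \quad\text{and}\quad \mathbf{I}_{\mathbf{y}} = \prod_{i=k+1}^{n} [a_i, b_i]. \]

If f : I → R is a bounded function defined on I, an iterated integral is an
integral of the form

\[ \int_{\mathbf{I}_{\mathbf{y}}}\int_{\mathbf{I}_{\mathbf{x}}} f(\mathbf{x},\mathbf{y})\,d\mathbf{x}\,d\mathbf{y} \quad\text{or}\quad \int_{\mathbf{I}_{\mathbf{x}}}\int_{\mathbf{I}_{\mathbf{y}}} f(\mathbf{x},\mathbf{y})\,d\mathbf{y}\,d\mathbf{x}, \]

whenever they exist.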


Let us consider the following example. 




Example 6.39 


Let g : [a, b] → R and h : [a, b] → R be continuous functions defined on
[a, b] such that g(x) ≤ h(x) for all x ∈ [a, b]. Consider the set D defined as

\[ D = \{(x,y) \mid a \le x \le b,\ g(x) \le y \le h(x)\}. \]

Let χ_D : R² → R be the corresponding characteristic function. If

\[ c \le g(x) \le h(x) \le d \quad\text{for all } x \in [a,b], \]

then I = [a, b] × [c, d] is a closed rectangle that contains D. We have seen
that D is a Jordan measurable set. Hence, χ_D : I → R is a Riemann
integrable function. For any x ∈ [a, b],

\[ \int_c^d \chi_D(x,y)\,dy = \int_{g(x)}^{h(x)} dy = h(x) - g(x). \]

Therefore, the iterated integral is equal to

\[ \int_a^b\left(\int_c^d \chi_D(x,y)\,dy\right)dx = \int_a^b \big(h(x) - g(x)\big)\,dx. \]

In single variable calculus, we have learned that the integral
∫_a^b (h(x) − g(x)) dx gives the area of D. Thus, in this case, we have

\[ \int_a^b\left(\int_c^d \chi_D(x,y)\,dy\right)dx = \operatorname{vol}(D) = \int_{[a,b]\times[c,d]} \chi_D(x,y)\,dx\,dy. \]

Namely, the iterated integral is equal to the double integral.


The following theorem is the general case when n = 2. 




Figure 6.26: The domain D = {(x, y) | a ≤ x ≤ b, g(x) ≤ y ≤ h(x)}.


Theorem 6.53 Fubini’s Theorem in the Plane 


Let I = [a, b] × [c, d], and let f : I → R be a Riemann integrable function.
For each x ∈ [a, b], define the function g_x : [c, d] → R by

\[ g_x(y) = f(x,y), \quad y \in [c,d]. \]

If g_x : [c, d] → R is Riemann integrable for each x ∈ [a, b], let
F : [a, b] → R be the function defined as

\[ F(x) = \int_c^d g_x(y)\,dy = \int_c^d f(x,y)\,dy. \]

Then we have the following.

(a) The function F : [a, b] → R is Riemann integrable.

(b) The integral of f : [a, b] × [c, d] → R is equal to the integral of
F : [a, b] → R. Namely,

\[ \int_{[a,b]\times[c,d]} f = \int_a^b F. \]

Equivalently,

\[ \int_{[a,b]\times[c,d]} f(x,y)\,dx\,dy = \int_a^b\int_c^d f(x,y)\,dy\,dx. \]




Let

\[ I = \int_{[a,b]\times[c,d]} f(x,y)\,dx\,dy. \]

We will show that for any ε > 0, there exists δ > 0 such that if P is a
partition of [a, b] with |P| < δ, and A = {α_i} is any set of intermediate
points for P, then

\[ |R(F, P, A) - I| < \varepsilon. \]

This will prove both (a) and (b).
Fix ε > 0. Since f : [a, b] × [c, d] → R is Riemann integrable with
integral I, there exists δ_0 > 0 such that if 𝐏 = (P_1, P_2) is a partition of
[a, b] × [c, d] with |𝐏| < δ_0, then

\[ U(f,\mathbf{P}) - L(f,\mathbf{P}) < \varepsilon. \]

Take δ = δ_0/2. Let P = {x_0, x_1, ..., x_k} be a partition of [a, b] with
|P| < δ. Take any partition P_2 = {y_0, y_1, ..., y_l} of [c, d] such that
|P_2| < δ. Let 𝐏 = (P_1, P_2), where P_1 = P. Then |𝐏| < 2δ = δ_0. For
1 ≤ i ≤ k, 1 ≤ j ≤ l, let

\[ m_{i,j} = \inf_{(x,y)\in[x_{i-1},x_i]\times[y_{j-1},y_j]} f(x,y), \qquad M_{i,j} = \sup_{(x,y)\in[x_{i-1},x_i]\times[y_{j-1},y_j]} f(x,y). \]

Then

\[ L(f,\mathbf{P}) = \sum_{i=1}^{k}\sum_{j=1}^{l} m_{i,j}(x_i - x_{i-1})(y_j - y_{j-1}), \qquad U(f,\mathbf{P}) = \sum_{i=1}^{k}\sum_{j=1}^{l} M_{i,j}(x_i - x_{i-1})(y_j - y_{j-1}). \]

Now let A = {α_i} be any choice of intermediate points for the partition
P = P_1. Notice that for any 1 ≤ i ≤ k, the additivity theorem says that

\[ F(\alpha_i) = \int_c^d g_{\alpha_i}(y)\,dy = \sum_{j=1}^{l}\int_{y_{j-1}}^{y_j} f(\alpha_i, y)\,dy. \]




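Since

\[ m_{i,j} \le f(\alpha_i, y) \le M_{i,j} \quad\text{for all } y \in [y_{j-1}, y_j], \]

we find that for 1 ≤ j ≤ l,

\[ m_{i,j}(y_j - y_{j-1}) \le \int_{y_{j-1}}^{y_j} f(\alpha_i, y)\,dy \le M_{i,j}(y_j - y_{j-1}). \]

Summing over j, multiplying by (x_i − x_{i−1}) and then summing over i, it
follows that

\[ L(f,\mathbf{P}) \le \sum_{i=1}^{k} F(\alpha_i)(x_i - x_{i-1}) = R(F, P, A) \le U(f,\mathbf{P}). \]

Since we also have

\[ L(f,\mathbf{P}) \le I \le U(f,\mathbf{P}), \]

we find that

\[ |R(F, P, A) - I| \le U(f,\mathbf{P}) - L(f,\mathbf{P}) < \varepsilon. \]

This completes the proof.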
Example 6.40 


Evaluate the integral

\[ \int_{[0,1]\times[0,1]} x\sin(xy)\,dx\,dy. \]


Solution 


The function f : [0, 1] × [0, 1] → R, f(x, y) = x sin(xy) is a continuous
function. Hence, it is Riemann integrable.




For each x ∈ [0, 1], the function g_x : [0, 1] → R, g_x(y) = x sin(xy) is also
continuous. Hence, g_x : [0, 1] → R is Riemann integrable. By Fubini's
theorem,

\[ \int_{[0,1]\times[0,1]} x\sin(xy)\,dx\,dy = \int_0^1\int_0^1 x\sin(xy)\,dy\,dx = \int_0^1 \Big[-\cos(xy)\Big]_{y=0}^{y=1} dx = \int_0^1 (1 - \cos x)\,dx = 1 - \Big[\sin x\Big]_0^1 = 1 - \sin 1. \]
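
The value 1 − sin 1 can be checked numerically. The following is a minimal sketch using a midpoint Riemann sum on a uniform n × n partition of the unit square; the helper name `midpoint_double` and the choice n = 200 are assumptions made for illustration.

```python
import math


def midpoint_double(f, n):
    """Midpoint Riemann sum of f over [0,1] x [0,1] with n x n equal cells."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            total += f(x, y)
    return total * h * h


approx = midpoint_double(lambda x, y: x * math.sin(x * y), 200)
exact = 1 - math.sin(1)
print(approx, exact)
```

The Riemann sum approximates the double integral directly, without using Fubini's theorem, so its agreement with 1 − sin 1 is an independent check of the computation.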


The roles of x and y in Fubini's theorem can be interchanged, and we obtain
the following.


Corollary 6.54 


Assume that f : [a, b] × [c, d] → R is a Riemann integrable function such
that for each x ∈ [a, b], the function g_x : [c, d] → R, g_x(y) = f(x, y) is
Riemann integrable; and for each y ∈ [c, d], the function h_y : [a, b] → R,
h_y(x) = f(x, y) is Riemann integrable. Then we can interchange the order
of integration. Namely,

\[ \int_c^d\int_a^b f(x,y)\,dx\,dy = \int_a^b\int_c^d f(x,y)\,dy\,dx. \]


Example 6.41 


If we evaluate the iterated integral ∫_0^1 ∫_0^1 x sin(xy) dx dy directly, it would
be quite tedious as we need to apply integration by parts to evaluate the
integral ∫_0^1 x sin(xy) dx. Using Corollary 6.54, we can interchange the
order of integration and obtain

\[ \int_0^1\int_0^1 x\sin(xy)\,dx\,dy = \int_0^1\int_0^1 x\sin(xy)\,dy\,dx = 1 - \sin 1. \]




Remark 6.7 


The assumption that f : [a, b] × [c, d] → R is Riemann integrable is essential
in Fubini's theorem. It does not follow from the fact that for each x ∈ [a, b],
the function g_x : [c, d] → R is Riemann integrable, and the function
F : [a, b] → R,

\[ F(x) = \int_c^d g_x(y)\,dy \]

is Riemann integrable. For example, let g : [−1, 1] → R and h : [−1, 1] → R
be the functions defined as

\[ g(x) = \begin{cases} 1, & \text{if } x \text{ is rational},\\ 0, & \text{if } x \text{ is irrational}, \end{cases} \qquad h(y) = \begin{cases} 1, & \text{if } y \ge 0,\\ -1, & \text{if } y < 0. \end{cases} \]

Then define the function f : [−1, 1] × [−1, 1] → R by

\[ f(x,y) = g(x)h(y). \]

Since h : [−1, 1] → R is a step function, it is Riemann integrable and

\[ \int_{-1}^{1} h(y)\,dy = \int_{-1}^{0} h(y)\,dy + \int_{0}^{1} h(y)\,dy = -1 + 1 = 0. \]

Hence, for fixed x ∈ [−1, 1],

\[ \int_{-1}^{1} f(x,y)\,dy = g(x)\int_{-1}^{1} h(y)\,dy = 0. \]

Thus, the function F : [−1, 1] → R,

\[ F(x) = \int_{-1}^{1} f(x,y)\,dy, \]

being a function that is always zero, is Riemann integrable with integral 0.
It follows that

\[ \int_{-1}^{1}\int_{-1}^{1} f(x,y)\,dy\,dx = 0. \]

However, one can prove that the function f : [−1, 1] × [−1, 1] → R is
not Riemann integrable, in the same way that we show that Dirichlet's
function is not Riemann integrable.




The fact that for each x ∈ [a, b], the function g_x : [c, d] → R, g_x(y) =
f(x, y) is Riemann integrable also does not follow from the fact that
f : [a, b] × [c, d] → R is Riemann integrable. Consider for example the function
f : [−1, 1] × [0, 1] → R,

\[ f(x,y) = \begin{cases} 1, & \text{if } x = 0 \text{ and } y \text{ is rational},\\ 0, & \text{otherwise}. \end{cases} \]

Then the set of discontinuities N of f : [−1, 1] × [0, 1] → R is the line
segment between the point (0, 0) and the point (0, 1). Hence, N has Jordan
content 0. Therefore, f : [−1, 1] × [0, 1] → R is Riemann integrable. For
x = 0, g_0 : [0, 1] → R is Dirichlet's function. Hence, g_0 : [0, 1] → R is
not Riemann integrable.


Now we consider the case depicted in Example 6.39 for more general functions. 


Theorem 6.55 


Let g : [a, b] → R and h : [a, b] → R be continuous functions defined on
[a, b] such that g(x) ≤ h(x) for all x ∈ [a, b], and let D be the set

\[ D = \{(x,y) \mid a \le x \le b,\ g(x) \le y \le h(x)\}. \]

If f : D → R is a continuous function, then

\[ \int_D f(x,y)\,dx\,dy = \int_a^b\int_{g(x)}^{h(x)} f(x,y)\,dy\,dx. \]


Since g and h are continuous functions, there exist numbers c and d such
that

\[ c \le g(x) \le h(x) \le d \quad\text{for all } x \in [a,b]. \]

Then I = [a, b] × [c, d] is a closed rectangle that contains D. We have
shown before that D is a Jordan measurable set and f : D → R is Riemann
integrable. Therefore, the zero extension f̌ : I → R of f is Riemann integrable.




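On the other hand, for each x ∈ [a, b], the function g_x : [c, d] → R obtained
by fixing x in the zero extension of f is a piecewise continuous function given by

\[ g_x(y) = \begin{cases} 0, & \text{if } c \le y < g(x),\\ f(x,y), & \text{if } g(x) \le y \le h(x),\\ 0, & \text{if } h(x) < y \le d. \end{cases} \]

Hence, g_x : [c, d] → R is Riemann integrable and for x ∈ [a, b],

\[ F(x) = \int_c^d g_x(y)\,dy = \int_{g(x)}^{h(x)} f(x,y)\,dy. \]

By Fubini's theorem in the plane, the function F : [a, b] → R is Riemann
integrable, and

\[ \int_a^b\int_{g(x)}^{h(x)} f(x,y)\,dy\,dx = \int_a^b F(x)\,dx = \int_{\mathbf{I}} \check{f}(x,y)\,dx\,dy = \int_D f(x,y)\,dx\,dy. \]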


Again, the roles of x and y in Theorem 6.55 can be interchanged. Let us look 


at the following example. 
Example 6.42 


Let D be the region in the plane bounded between the curve y² = x and
the line L between the points (1, 1) and (4, −2). Evaluate the integral

\[ \int_D y\,dx\,dy. \]

Solution

The equation of the line L is x + y = 2. Hence,

\[ D = \{(x,y) \mid -2 \le y \le 1,\ y^2 \le x \le 2 - y\}. \]




Using Fubini's theorem, we find that

\[ \int_D y\,dx\,dy = \int_{-2}^{1}\int_{y^2}^{2-y} y\,dx\,dy = \int_{-2}^{1} y(2 - y - y^2)\,dy = -\frac{9}{4}. \]


Figure 6.27: The domain D = {(x, y) | −2 ≤ y ≤ 1, y² ≤ x ≤ 2 − y}.
In Example 6.42, it will be harder if one prefers to integrate over y first. 
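
The last single-variable integral in Example 6.42 can be evaluated in exact rational arithmetic. The sketch below is an illustration, with the antiderivative G(y) = y² − y³/3 − y⁴/4 of y(2 − y − y²) worked out by hand; the name `G` is introduced here and is not from the text.

```python
from fractions import Fraction


def G(y):
    """Antiderivative of y(2 - y - y^2) = 2y - y^2 - y^3."""
    y = Fraction(y)
    return y**2 - y**3 / 3 - y**4 / 4


# Integral of y(2 - y - y^2) over [-2, 1].
value = G(1) - G(-2)
print(value)
```

The negative sign is expected: the part of D below the x-axis, where y < 0, is larger than the part above it.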


Example 6.43 
Let

\[ D = \{(x,y) \mid x \ge 0,\ y \ge 0,\ 4x^2 + 9y^2 \le 36\}. \]

Evaluate the integral ∫_D x dxdy.


Solution 


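The set D can be expressed in two different ways.

\[ D = \left\{(x,y) \,\middle|\, 0 \le x \le 3,\ 0 \le y \le \tfrac{1}{3}\sqrt{36 - 4x^2}\right\} \]

or

\[ D = \left\{(x,y) \,\middle|\, 0 \le y \le 2,\ 0 \le x \le \tfrac{1}{2}\sqrt{36 - 9y^2}\right\}. \]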




The function f : D → R, f(x, y) = x is continuous. Hence, the integral
∫_D x dxdy is equal to iterated integrals, which we can integrate with respect
to x first, or with respect to y first. If we integrate with respect to y first, we
find that

\[ \int_D x\,dx\,dy = \int_0^3\int_0^{\frac{1}{3}\sqrt{36-4x^2}} x\,dy\,dx = \frac{1}{3}\int_0^3 x\sqrt{36 - 4x^2}\,dx. \]

This integral needs to be computed using integration by substitution. If we
integrate over x first, we find that

\[ \int_D x\,dx\,dy = \int_0^2\int_0^{\frac{1}{2}\sqrt{36-9y^2}} x\,dx\,dy = \frac{1}{2}\int_0^2 \Big[x^2\Big]_{x=0}^{x=\frac{1}{2}\sqrt{36-9y^2}} dy = \frac{1}{8}\int_0^2 (36 - 9y^2)\,dy. \]

This integral can be easily evaluated to give

\[ \int_D x\,dx\,dy = \frac{1}{8}\Big[36y - 3y^3\Big]_0^2 = 6. \]
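
Both orders of integration in Example 6.43 can be cross-checked. The sketch below is an illustration under stated assumptions: the x-first value is computed exactly from the antiderivative 36y − 3y³, while the y-first single integral (1/3)∫_0^3 x√(36 − 4x²) dx is only approximated with a midpoint rule; the function name `y_first` and the number of subintervals are choices made here.

```python
import math
from fractions import Fraction

# x first: (1/8) * [36 y - 3 y^3] evaluated from y = 0 to y = 2.
x_first = Fraction(1, 8) * (36 * 2 - 3 * 2**3)


def y_first(n):
    """Midpoint rule for (1/3) * integral of x * sqrt(36 - 4 x^2) over [0, 3]."""
    h = 3.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h  # midpoints stay strictly inside (0, 3)
        total += x * math.sqrt(36 - 4 * x * x)
    return total * h / 3


approx = y_first(20000)
print(x_first, approx)
```

Both orders give the same value, as Fubini's theorem guarantees; the x-first order simply avoids the square root in the final integral.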


Figure 6.28: The domain D = {(x, y) | x ≥ 0, y ≥ 0, 4x² + 9y² ≤ 36}.


Now let us generalize Fubini's theorem to an arbitrary positive integer n that
is larger than 1.




Theorem 6.56 Fubini’s Theorem 


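Let n be a positive integer larger than 1, and let k be a positive integer less
than n. Given that I = ∏_{i=1}^{n} [a_i, b_i] is a closed rectangle in R^n, let

\[ \mathbf{I}_{\mathbf{x}} = \prod_{i=1}^{k} [a_i, b_i] \quad\text{and}\quad \mathbf{I}_{\mathbf{y}} = \prod_{i=k+1}^{n} [a_i, b_i]. \]

Assume that f : I → R is a Riemann integrable function such that for each x
in I_x, the function g_x : I_y → R defined by

\[ g_{\mathbf{x}}(\mathbf{y}) = f(\mathbf{x},\mathbf{y}), \quad \mathbf{y} \in \mathbf{I}_{\mathbf{y}} \]

is Riemann integrable. Let F : I_x → R be the function defined as

\[ F(\mathbf{x}) = \int_{\mathbf{I}_{\mathbf{y}}} g_{\mathbf{x}}(\mathbf{y})\,d\mathbf{y} = \int_{\mathbf{I}_{\mathbf{y}}} f(\mathbf{x},\mathbf{y})\,d\mathbf{y}. \]

Then we have the following.

(a) The function F : I_x → R is Riemann integrable.

(b) The integral of f : I → R is equal to the integral of F : I_x → R.
Namely,

\[ \int_{\mathbf{I}} f(\mathbf{x},\mathbf{y})\,d\mathbf{x}\,d\mathbf{y} = \int_{\mathbf{I}_{\mathbf{x}}}\int_{\mathbf{I}_{\mathbf{y}}} f(\mathbf{x},\mathbf{y})\,d\mathbf{y}\,d\mathbf{x}. \]

(c) For each y in I_y, define the function h_y : I_x → R by

\[ h_{\mathbf{y}}(\mathbf{x}) = f(\mathbf{x},\mathbf{y}), \quad \mathbf{x} \in \mathbf{I}_{\mathbf{x}}. \]

If the function h_y : I_x → R is Riemann integrable for each y ∈ I_y,
then we can interchange the order of integration. Namely,

\[ \int_{\mathbf{I}_{\mathbf{x}}}\int_{\mathbf{I}_{\mathbf{y}}} f(\mathbf{x},\mathbf{y})\,d\mathbf{y}\,d\mathbf{x} = \int_{\mathbf{I}_{\mathbf{y}}}\int_{\mathbf{I}_{\mathbf{x}}} f(\mathbf{x},\mathbf{y})\,d\mathbf{x}\,d\mathbf{y}. \]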


The proof is similar to the n = 2 case and we leave it to the readers. 
A useful case is the following which generalizes Theorem 6.55. 




Theorem 6.57 


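Let U be a Jordan measurable set in R^{n−1}, and let g : U → R and h : U → R
be bounded continuous functions on U satisfying g(x) ≤ h(x) for all
x ∈ U. Consider the subset D of R^n defined as

\[ D = \{(\mathbf{x}, y) \mid \mathbf{x} \in U,\ g(\mathbf{x}) \le y \le h(\mathbf{x})\}. \]

If f : D → R is a bounded continuous function, then it is Riemann
integrable, and

\[ \int_D f(\mathbf{x}, y)\,d\mathbf{x}\,dy = \int_U\left(\int_{g(\mathbf{x})}^{h(\mathbf{x})} f(\mathbf{x}, y)\,dy\right)d\mathbf{x}. \]

Let us look at an example.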


Example 6.44 


Evaluate the integral ∫_S x dxdydz, where S is the solid bounded between
the plane x + y + z = 1 and the three coordinate planes. Then find the
integral ∫_S (x + 5y + 3z) dxdydz.


Solution
The solid S can be expressed as

\[ S = \{(x,y,z) \mid (x,y) \in D,\ 0 \le z \le 1 - x - y\}, \]

where

\[ D = \{(x,y) \mid 0 \le x \le 1,\ 0 \le y \le 1 - x\}. \]

Since D is a triangle, it is a Jordan measurable set. The function
f(x, y, z) = x is continuous.




Hence, we can apply Theorem 6.57.

\[ \int_S x\,dx\,dy\,dz = \int_D\left(\int_0^{1-x-y} x\,dz\right)dx\,dy = \int_0^1\int_0^{1-x} x(1 - x - y)\,dy\,dx = \int_0^1 x\left[(1-x)y - \frac{y^2}{2}\right]_{y=0}^{y=1-x} dx = \frac{1}{2}\int_0^1 x(1-x)^2\,dx = \frac{1}{24}. \]

Since the solid S is symmetric in x, y and z, we have

\[ \int_S x\,dx\,dy\,dz = \int_S y\,dx\,dy\,dz = \int_S z\,dx\,dy\,dz. \]

Therefore,

\[ \int_S (x + 5y + 3z)\,dx\,dy\,dz = 9\int_S x\,dx\,dy\,dz = \frac{3}{8}. \]
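
The final arithmetic in Example 6.44 can be reproduced in exact rational arithmetic. The sketch below is an illustration: it expands x(1 − x)² = x − 2x² + x³ by hand and integrates the polynomial term by term over [0, 1]; the helper `integrate_poly_01` is a hypothetical name introduced here.

```python
from fractions import Fraction


def integrate_poly_01(coeffs):
    """Integral over [0, 1] of sum(coeffs[k] * x**k), computed term by term."""
    return sum(Fraction(c, k + 1) for k, c in enumerate(coeffs))


# x (1 - x)^2 = x - 2 x^2 + x^3, listed as coefficients of x^0, x^1, x^2, x^3.
int_x = Fraction(1, 2) * integrate_poly_01([0, 1, -2, 1])

# By the symmetry of S in x, y and z, the integral of x + 5y + 3z is 9 times int_x.
combined = (1 + 5 + 3) * int_x
print(int_x, combined)
```

The symmetry argument replaces three separate triple integrals by a single one, which is why only the coefficient sum 1 + 5 + 3 = 9 enters.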


Figure 6.29: The solid S bounded between the plane x + y + z = 1 and the three 
coordinate planes. 




Exercises 6.4 


Question 1 


Let I = (0, 2] x [0, 2], and let f : I > R be the function defined as f(x, y) = 
x'y?. For a positive integer k, let P;, be the uniformly regular partition of 
I into k² rectangles. Write down the summation formula for the Darboux
upper sum U(f, P_k). Show that the limit lim_{k→∞} U(f, P_k) exists and find the
limit.

Question 2 


Let D be the triangle with vertices (0,0), (1,0) and (1,1). Evaluate the 


integral i e” dady. 
2 


Question 3 


Let D be the region in the plane bounded between the curve y = x² and the
line y = 2x + 3. Evaluate the integral ∫_D (x + 2y) dxdy.


Question 4 


Let D be the region in the plane bounded between the curve y² = 4x and
the line y = 2x − 4. Evaluate the integral ∫_D (x + y) dxdy.


Question 5 


Evaluate the integral

\[ \int_0^1\int_{\sqrt{y}}^{1} \sqrt{9x^3 + 16}\,dx\,dy. \]


Question 6 


Evaluate the integral ∫_S xy dxdydz, where S is the solid bounded between
the plane x + y + z = 4 and the three coordinate planes. Then find the
integral ∫_S (4xy + 5yz + 6xz) dxdydz.




Question 7 


Let f : [a, b] → R be a continuous function, and let G be the set

\[ G = \{(x,y,0) \mid a \le x \le b,\ 0 \le y \le |f(x)|\} \]

in R³ that lies in the plane z = 0. Rotating the set G about the x-axis, we
obtain a solid of revolution S, which can be described as

\[ S = \{(x,y,z) \mid a \le x \le b,\ y^2 + z^2 \le f(x)^2\}. \]

Show that the volume of S is

\[ \operatorname{vol}(S) = \pi\int_a^b f(x)^2\,dx. \]

Question 8 


Let D_1 and D_2 be Jordan measurable sets in R^m and R^n respectively. Show
that the set D = D_1 × D_2 is a Jordan measurable set in R^{m+n} and

vol (D_1 × D_2) = vol (D_1) × vol (D_2).




6.5 Change of Variables Theorem 


Consider the problem of evaluating an integral of the form ∫_D f(x, y) dxdy when
D is the disc D = {(x, y) | x² + y² ≤ r²}. When f : D → R is a continuous
function, Fubini's theorem says that we can write the integral as

\[ \int_D f(x,y)\,dx\,dy = \int_{-r}^{r}\int_{-\sqrt{r^2-x^2}}^{\sqrt{r^2-x^2}} f(x,y)\,dy\,dx. \]


However, it is usually quite complicated to evaluate this integral due to the square 
roots. In some sense, we have not fully utilized the circular symmetry of the region 
of integration D. For regions that have circular symmetry, it might be easier if we
use polar coordinates (r, θ) instead of rectangular coordinates (x, y). The goal of
this section is to discuss the change of variables formula for multiple integrals. 
For single variable functions, the change of variable formula is usually known 
as integration by substitution. We have proved the following theorem in volume I. 


Theorem 6.58 Integration by Substitution 


Let ψ : [a, b] → R be a function that satisfies the following conditions:

(i) ψ is continuous and one-to-one on [a, b];
(ii) ψ is continuously differentiable on (a, b);
(iii) ψ′(x) is bounded on (a, b).

If ψ([a, b]) = [c, d], and f : [c, d] → R is a bounded function that is
continuous on (c, d), then the function h : [a, b] → R,

\[ h(x) = f(\psi(x))\,|\psi'(x)| \]

is Riemann integrable and

\[ \int_c^d f(u)\,du = \int_a^b f(\psi(x))\,|\psi'(x)|\,dx. \]

The function ψ : [a, b] → R that satisfies all the three conditions (i)–(iii) in
Theorem 6.58 defines a smooth change of variables u = ψ(x) from x to u.




Definition 6.18 Smooth Change of Variables 


Let O be an open subset of R^n. A mapping Ψ : O → R^n from O to R^n is
called a smooth change of variables provided that it satisfies the following
conditions.

(i) Ψ : O → R^n is one-to-one.

(ii) Ψ : O → R^n is continuously differentiable.

(iii) For each x ∈ O, the derivative matrix DΨ(x) is invertible.


Remark 6.8 


If the mapping Ψ : O → R^n is continuously differentiable, and the
derivative matrix DΨ(x) is invertible for each x ∈ O, the inverse function
theorem implies that the mapping Ψ : O → R^n is locally one-to-one.
However, it might not be globally one-to-one. For Ψ : O → R^n to be
a smooth change of variables, we need to impose the additional condition
that it is globally one-to-one.


Example 6.45 


Let x_0 be a point in R^n. The mapping Ψ : R^n → R^n, Ψ(x) = x + x_0 is
a smooth change of variables. It is one-to-one, continuously differentiable,
and the derivative matrix is DΨ(x) = I_n, which is invertible.


Example 6.46 


Let x_0 and y_0 be points in R^n, and let A be an invertible n × n matrix. The
mapping Ψ : R^n → R^n defined by

\[ \Psi(\mathbf{x}) = \mathbf{y}_0 + A(\mathbf{x} - \mathbf{x}_0) \]

is a one-to-one continuously differentiable mapping. Its derivative matrix
is DΨ(x) = A, which is invertible for all x in R^n. This shows that Ψ is a
smooth change of variables.


The mapping Ψ : R^n → R^n in Example 6.46 is a composition of translations
and an invertible linear transformation.


Example 6.47 


Let O = {(x, y) | x > 0, y > 0}, and let Ψ : O → R² be the mapping
defined as

\[ \Psi(x,y) = (x^2 - y^2,\ 2xy). \]

Show that Ψ : O → R² is a smooth change of variables.


Solution
First we show that Ψ is one-to-one. If Ψ(x_1, y_1) = Ψ(x_2, y_2), then

\[ x_1^2 - y_1^2 = x_2^2 - y_2^2, \qquad 2x_1y_1 = 2x_2y_2. \]

Let z_1 = x_1 + iy_1 and z_2 = x_2 + iy_2. Then we find that

\[ z_1^2 = x_1^2 - y_1^2 + 2ix_1y_1 = x_2^2 - y_2^2 + 2ix_2y_2 = z_2^2. \]

Hence, we must have z_2 = ±z_1. Restricted to O, x_1, x_2, y_1, y_2 > 0. Hence,
we must have z_1 = z_2, or equivalently, (x_1, y_1) = (x_2, y_2). This shows that
Ψ : O → R² is one-to-one. Now

\[ D\Psi(x,y) = \begin{bmatrix} 2x & -2y \\ 2y & 2x \end{bmatrix} \]

is continuous, and det DΨ(x, y) = 4x² + 4y² ≠ 0 for all (x, y) ∈ O.
This proves that Ψ is continuously differentiable and the derivative matrix
DΨ(x, y) is invertible for all (x, y) ∈ O. Hence, Ψ : O → R² is a smooth
change of variables.
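
The Jacobian determinant 4x² + 4y² of the mapping in Example 6.47 can be checked with finite differences. The following is a rough numerical sketch rather than a proof; the function names `psi` and `jacobian_det`, the step size h and the sample point are assumptions made for illustration.

```python
def psi(x, y):
    """The mapping of Example 6.47."""
    return (x * x - y * y, 2 * x * y)


def jacobian_det(x, y, h=1e-5):
    """Determinant of the derivative matrix, using central differences."""
    ux = (psi(x + h, y)[0] - psi(x - h, y)[0]) / (2 * h)
    uy = (psi(x, y + h)[0] - psi(x, y - h)[0]) / (2 * h)
    vx = (psi(x + h, y)[1] - psi(x - h, y)[1]) / (2 * h)
    vy = (psi(x, y + h)[1] - psi(x, y - h)[1]) / (2 * h)
    return ux * vy - uy * vx


x0, y0 = 1.5, 0.5
numeric = jacobian_det(x0, y0)
formula = 4 * x0 * x0 + 4 * y0 * y0
print(numeric, formula)
```

Because the components of the mapping are quadratic, central differences are exact up to rounding, so the two numbers should agree to many digits.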


In this section, we will state the change of variables theorem, and give some
discussion of why this theorem holds. We will also look at examples of how
this theorem is applied, especially for polar and spherical coordinates. The proof
of the theorem is quite technical and will be given in the next section.




Theorem 6.59 The Change of Variables Theorem 


Let O be an open subset of R^n, and let Ψ : O → R^n be a smooth change
of variables. If D is a Jordan measurable set such that its closure is
contained in O, then Ψ(D) is also Jordan measurable. If f : Ψ(D) → R is
a bounded continuous function, then the function g : D → R defined as

\[ g(\mathbf{x}) = f(\Psi(\mathbf{x}))\,|\det D\Psi(\mathbf{x})| \]

is Riemann integrable, and

\[ \int_{\Psi(D)} f(\mathbf{x})\,d\mathbf{x} = \int_D g(\mathbf{x})\,d\mathbf{x} = \int_D f(\Psi(\mathbf{x}))\,|\det D\Psi(\mathbf{x})|\,d\mathbf{x}. \tag{6.5} \]

Notice that the two vertical lines on det DΨ(x) in (6.5) mean the absolute
value, not the determinant.


Remark 6.9 Jacobian 


For a mapping Ψ : O → R^n from a subset of R^n to R^n, the derivative
matrix DΨ(x) is also called the Jacobian matrix of the mapping Ψ. The
determinant of the Jacobian matrix is denoted by

\[ \frac{\partial(\Psi_1, \ldots, \Psi_n)}{\partial(x_1, \ldots, x_n)}. \]

It is known as the Jacobian determinant, or simply as the Jacobian. In practice,
we will often denote a change of variables Ψ : O → R^n by u = Ψ(x).
Then the Jacobian can be written as

\[ \frac{\partial(u_1, \ldots, u_n)}{\partial(x_1, \ldots, x_n)}. \]

Using this notation, the change of variables formula (6.5) reads as

\[ \int_{\Psi(D)} f(u_1, \ldots, u_n)\,du_1\cdots du_n = \int_D f(u_1(\mathbf{x}), \ldots, u_n(\mathbf{x}))\left|\frac{\partial(u_1, \ldots, u_n)}{\partial(x_1, \ldots, x_n)}\right| dx_1\cdots dx_n. \]




6.5.1. Translations and Linear Transformations 


In the single variable case, a translation is a map T : R → R, T(x) = x + c. If
f : [a, b] → R is a Riemann integrable function, then

\[ \int_a^b f(x)\,dx = \int_{a-c}^{b-c} f(x + c)\,dx. \]

For n ≥ 2, we have the following theorem, which is a stronger version of Theorem
6.59.


Theorem 6.60 


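Let x_0 be a fixed point in R^n, and let Ψ : R^n → R^n be the translation
Ψ(x) = x + x_0. If D is a Jordan measurable subset of R^n, then Ψ(D) is
Jordan measurable. If f : Ψ(D) → R is a Riemann integrable function,
then g = (f ∘ Ψ) : D → R is also Riemann integrable, and

\[ \int_{\Psi(D)} f(\mathbf{x})\,d\mathbf{x} = \int_D g(\mathbf{x})\,d\mathbf{x} = \int_D f(\Psi(\mathbf{x}))\,d\mathbf{x} = \int_D f(\mathbf{x} + \mathbf{x}_0)\,d\mathbf{x}. \tag{6.6} \]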


Obviously, a translation maps a rectangle to a rectangle with the same
volume. Hence, it maps sets that have Jordan content zero to sets that have
Jordan content zero. It is also obvious that Ψ maps the boundary of D to
the boundary of Ψ(D). This shows that Ψ(D) is Jordan measurable.

If I is a closed rectangle that contains D, then I′ = I + x_0 is a closed
rectangle that contains Ψ(D). Let f̌ : I′ → R and ǧ : I → R be the zero
extensions of f : Ψ(D) → R and g = (f ∘ Ψ) : D → R respectively. Then
ǧ = f̌ ∘ Ψ. Since f : Ψ(D) → R is Riemann integrable,

\[ \int_{\Psi(D)} f = \int_{\mathbf{I}'} \check{f} = \underline{\int_{\mathbf{I}'}} \check{f} = \overline{\int_{\mathbf{I}'}} \check{f}. \]

Given a partition 𝐏′ = (P′_1, ..., P′_n) of I′, let 𝐏 be the
partition of I induced by the translation Ψ. Namely, 𝐏 is a partition such
that the rectangle J is in 𝒥_𝐏 if and only if J′ = Ψ(J) = J + x_0 is in 𝒥_𝐏′.




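Then, for each rectangle J in 𝒥_𝐏,

\[ m_{\mathbf{J}}(\check{g}) = \inf\{\check{g}(\mathbf{x}) \mid \mathbf{x} \in \mathbf{J}\} = \inf\{\check{f}(\mathbf{x} + \mathbf{x}_0) \mid \mathbf{x} \in \mathbf{J}\} = \inf\{\check{f}(\mathbf{x}') \mid \mathbf{x}' \in \mathbf{J} + \mathbf{x}_0\} = m_{\mathbf{J}'}(\check{f}). \]

It follows that

\[ L(\check{g}, \mathbf{P}) = \sum_{\mathbf{J}\in\mathcal{J}_{\mathbf{P}}} m_{\mathbf{J}}(\check{g})\operatorname{vol}(\mathbf{J}) = \sum_{\mathbf{J}'\in\mathcal{J}_{\mathbf{P}'}} m_{\mathbf{J}'}(\check{f})\operatorname{vol}(\mathbf{J}') = L(\check{f}, \mathbf{P}'). \]

Similarly, we have

\[ U(\check{g}, \mathbf{P}) = U(\check{f}, \mathbf{P}'). \]

Thus, the sets S_L(ǧ) and S_L(f̌) of lower sums of ǧ and f̌ are the same, and
the sets S_U(ǧ) and S_U(f̌) of upper sums of ǧ and f̌ are also the same. These
imply that

\[ \underline{\int_{\mathbf{I}}} \check{g} = \sup S_L(\check{g}) = \sup S_L(\check{f}) = \underline{\int_{\mathbf{I}'}} \check{f}, \qquad \overline{\int_{\mathbf{I}}} \check{g} = \inf S_U(\check{g}) = \inf S_U(\check{f}) = \overline{\int_{\mathbf{I}'}} \check{f}. \]

Hence, g : D → R is Riemann integrable and

\[ \int_D g = \int_{\mathbf{I}'} \check{f} = \int_{\Psi(D)} f. \]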


Remark 6.10 


It is easy to check that for the translation Ψ : R^n → R^n, Ψ(x) = x + x_0,
the change of variables formula (6.6) is precisely the formula (6.5), since
DΨ(x) = I_n in this case.




Figure 6.30: A translation in the plane. 


Example 6.48 
Let

\[ D = \{(x,y) \mid (x-2)^2 + (y+3)^2 \le 16\}. \]

Evaluate the integral ∫_D (4x + y) dxdy.


Solution
Make the change of variables u = x − 2, v = y + 3, which is a translation.
Then x = u + 2, y = v − 3, and we have

\[ \int_D (4x + y)\,dx\,dy = \int_{u^2+v^2\le 16} (4u + 8 + v - 3)\,du\,dv = \int_{u^2+v^2\le 16} (4u + v + 5)\,du\,dv. \]

Since the disc B = {(u, v) | u² + v² ≤ 16} is invariant when we change u
to −u, or change v to −v, the integrals

\[ \int_{u^2+v^2\le 16} u\,du\,dv \quad\text{and}\quad \int_{u^2+v^2\le 16} v\,du\,dv \]

are equal to 0. Therefore,

\[ \int_D (4x + y)\,dx\,dy = 5\int_{u^2+v^2\le 16} du\,dv. \]

In single variable analysis, we have shown that the area of a disc of radius
r is πr². Hence,

\[ \int_D (4x + y)\,dx\,dy = 5 \times \operatorname{area}(B) = 5 \times 16\pi = 80\pi. \]
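
The value 80π can be cross-checked without the translation, by applying Fubini's theorem directly on the disc centred at (2, −3). The sketch below is an illustration under stated assumptions: for each x the inner integral of 4x + y over the vertical slice is computed in closed form, and the outer integral over [−2, 6] is approximated with a midpoint rule; the function name and the number of subintervals are choices made here.

```python
import math


def integral_over_disc(n):
    """Approximate the integral of 4x + y over (x - 2)^2 + (y + 3)^2 <= 16."""
    h = 8.0 / n
    total = 0.0
    for i in range(n):
        x = -2.0 + (i + 0.5) * h
        s = math.sqrt(16 - (x - 2) ** 2)  # half-length of the vertical slice at x
        y1, y2 = -3 - s, -3 + s
        # Exact inner integral of (4x + y) dy from y1 to y2.
        inner = 4 * x * (y2 - y1) + (y2 ** 2 - y1 ** 2) / 2
        total += inner * h
    return total


approx = integral_over_disc(20000)
print(approx, 80 * math.pi)
```

The agreement illustrates the point of the example: the translation does not change the value of the integral, it only moves the disc to the origin, where the symmetry argument becomes available.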


Now we consider a linear transformation T : R^n → R^n, T(x) = Ax defined
by an invertible matrix A. Since DT(x) = A, the change of variables theorem
says that for any function f : T(D) → R that is bounded and continuous on T(D),

\[ \int_{\mathbf{T}(D)} f(\mathbf{x})\,d\mathbf{x} = \int_D f(\mathbf{T}(\mathbf{x}))\,|\det A|\,d\mathbf{x} = |\det A|\int_D f(\mathbf{T}(\mathbf{x}))\,d\mathbf{x}. \tag{6.7} \]

In the special case where f is the constant function 1, we have

\[ \operatorname{vol}(\mathbf{T}(D)) = |\det A|\operatorname{vol}(D). \]

A very crucial fact in the proof of the change of variables theorem is a special case
of this formula when D is a rectangle.


Theorem 6.61 
Let I = ∏_{i=1}^{n} [a_i, b_i], and let T : R^n → R^n, T(x) = Ax be an invertible
linear transformation. Then

\[ \operatorname{vol}(\mathbf{T}(\mathbf{I})) = |\det A|\operatorname{vol}(\mathbf{I}). \]


Linear transformations map linear objects to linear objects. However, the
image of a rectangle under a linear transformation is not necessarily a rectangle,
but is always a parallelepiped. If I is the closed rectangle ∏_{i=1}^{n} [a_i, b_i], a point x in I
can be written as

\[ \mathbf{x} = \mathbf{a} + t_1(b_1 - a_1)\mathbf{e}_1 + \cdots + t_n(b_n - a_n)\mathbf{e}_n, \]

where a = (a_1, ..., a_n), and t = (t_1, ..., t_n) ∈ [0, 1]^n. Hence, we say that the
rectangle I is a parallelepiped based at the point a and spanned by the n linearly
independent vectors v_i = (b_i − a_i)e_i, 1 ≤ i ≤ n.


Definition 6.19 Parallelepipeds 


A (closed) parallelepiped in R^n is a solid 𝒫 in R^n based at a point a and
spanned by n linearly independent vectors v_1, ..., v_n. It can be described
as

\[ \mathscr{P} = \{\mathbf{a} + t_1\mathbf{v}_1 + \cdots + t_n\mathbf{v}_n \mid \mathbf{t} = (t_1, \ldots, t_n) \in [0,1]^n\}. \]


Figure 6.31: Parallelepipeds in R? and R°. 


The boundary of a parallelepiped is a union of 2n bounded subsets, each of
which is contained in a hyperplane. Thus, the boundary of a parallelepiped has
Jordan content zero. Therefore, a parallelepiped is a Jordan measurable set.


If 𝒫 is a parallelepiped in R^n based at the point a and spanned by the vectors
v_1, ..., v_n, and T : R^n → R^n is an invertible linear transformation, then T(𝒫)
is the parallelepiped in R^n based at the point T(a) and spanned by the vectors
T(v_1), ..., T(v_n).

The cube [0, 1]^n is called the standard unit cube and it is often denoted by Q_n.
If 𝒫 is a parallelepiped in R^n based at the point a and spanned by the vectors
v_1, ..., v_n, then 𝒫 = Ψ(Q_n), where Ψ : R^n → R^n is the mapping

\[ \Psi(\mathbf{x}) = A\mathbf{x} + \mathbf{a}, \qquad A = \begin{bmatrix} \mathbf{v}_1 & \cdots & \mathbf{v}_n \end{bmatrix}, \]

which is a composition of an invertible linear transformation and a translation.
Theorem 6.61 says that

\[ \operatorname{vol}(\mathscr{P}) = |\det A|, \tag{6.9} \]


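where A is the matrix whose column vectors are v_1, ..., v_n. For example, for a
parallelogram in R² which is spanned by the vectors

\[ \mathbf{v}_1 = \begin{bmatrix} a_1 \\ b_1 \end{bmatrix} \quad\text{and}\quad \mathbf{v}_2 = \begin{bmatrix} a_2 \\ b_2 \end{bmatrix}, \]

the area of the parallelogram is

\[ \left|\det\begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix}\right| = |a_1b_2 - a_2b_1|. \]

For a parallelepiped in R³ which is spanned by the vectors

\[ \mathbf{v}_1 = \begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} a_2 \\ b_2 \\ c_2 \end{bmatrix} \quad\text{and}\quad \mathbf{v}_3 = \begin{bmatrix} a_3 \\ b_3 \\ c_3 \end{bmatrix}, \]

the volume of the parallelepiped is

\[ \left|\det\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix}\right|. \]

These formulas have been derived in an elementary course. For general n, we
will prove (6.9) in Appendix B using geometric arguments. This will then imply
Theorem 6.61.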
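
Formula (6.9) is easy to evaluate for n = 3. The following is a minimal sketch, assuming a cofactor expansion along the first row for the determinant; the helper names `det3` and `parallelepiped_volume` and the sample vectors are hypothetical choices made for illustration.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def parallelepiped_volume(v1, v2, v3):
    """|det A|, where the columns of A are the spanning vectors v1, v2, v3."""
    A = [[v1[r], v2[r], v3[r]] for r in range(3)]
    return abs(det3(A))


unit_cube = parallelepiped_volume((1, 0, 0), (0, 1, 0), (0, 0, 1))
sheared = parallelepiped_volume((1, 0, 0), (1, 2, 0), (1, 1, 3))
print(unit_cube, sheared)
```

The base point a does not enter the computation, which reflects the fact that translations preserve volume.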

From the theory of linear algebra, we know that an invertible matrix is a
product of elementary matrices. Hence, an invertible linear transformation
T : R^n → R^n can be written as

\[ \mathbf{T} = \mathbf{T}_m \circ \cdots \circ \mathbf{T}_2 \circ \mathbf{T}_1, \]

where each of T_1, T_2, ..., T_m is one of the three types of elementary transformations,
corresponding to the three types of elementary matrices.




I. When $E$ is the elementary matrix obtained from the identity matrix $I_n$ by interchanging two distinct rows $i$ and $j$, the linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$, $\mathbf{T}(\mathbf{x}) = E\mathbf{x}$ interchanges $x_i$ and $x_j$, and fixes the other variables. In this case, $\det E = -1$ and $|\det E| = 1$.

II. When $E$ is the elementary matrix obtained from the identity matrix $I_n$ by multiplying row $i$ by a nonzero constant $c$, the linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$, $\mathbf{T}(\mathbf{x}) = E\mathbf{x}$ maps the point $\mathbf{x} = (x_1, \ldots, x_n)$ to

\[ \mathbf{T}(\mathbf{x}) = (x_1, \ldots, x_{i-1}, c x_i, x_{i+1}, \ldots, x_n). \]

In this case, $\det E = c$, and $|\det E| = |c|$.

III. When $E$ is the elementary matrix obtained from the identity matrix $I_n$ by adding a constant $c$ times row $j$ to another row $i$, the linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$, $\mathbf{T}(\mathbf{x}) = E\mathbf{x}$ maps the point $\mathbf{x} = (x_1, \ldots, x_n)$ to

\[ \mathbf{T}(\mathbf{x}) = (x_1, \ldots, x_{i-1}, x_i + c x_j, x_{i+1}, \ldots, x_n). \]

In this case, $\det E = 1$, and $|\det E| = 1$.


Since each of the elementary transformations involves changes in at most two 
variables, it is sufficient to consider these transformations when n = 2. 


Example 6.49 


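Let $\mathbf{T} : \mathbb{R}^2 \to \mathbb{R}^2$ be the linear transformation

\[ \mathbf{T}(x, y) = (y, x). \]

The matrix $E$ corresponding to this transformation is

\[ E = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}. \]

Under this transformation, the rectangle $I = [a, b] \times [c, d]$ is mapped to the rectangle $I' = [c, d] \times [a, b]$. It is easy to see that

\[ \operatorname{vol}(I') = \operatorname{vol}(I). \]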


Figure 6.32: The linear transformation $\mathbf{T}(x, y) = (y, x)$.

Figure 6.33: The linear transformation $\mathbf{T}(x, y) = (x, 2y)$.


Example 6.50 


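Let $\mathbf{T} : \mathbb{R}^2 \to \mathbb{R}^2$ be the linear transformation

\[ \mathbf{T}(x, y) = (x, ky), \quad k \neq 0. \]

The matrix $E$ corresponding to this transformation is

\[ E = \begin{bmatrix} 1 & 0 \\ 0 & k \end{bmatrix}. \]

Under this transformation, the rectangle $I = [a, b] \times [c, d]$ is mapped to the rectangle $I' = [a, b] \times [kc, kd]$ if $k > 0$; and to the rectangle $I' = [a, b] \times [kd, kc]$ if $k < 0$. In any case, we find that

\[ \operatorname{vol}(I') = |k| \operatorname{vol}(I). \]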




Example 6.51 


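Let $\mathbf{T} : \mathbb{R}^2 \to \mathbb{R}^2$ be the linear transformation

\[ \mathbf{T}(x, y) = (x + ky, y). \]

The matrix $E$ corresponding to this transformation is

\[ E = \begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}. \]

Under this transformation, the rectangle $I = [a, b] \times [c, d]$ is mapped to the parallelogram $\mathcal{P}$ with vertices $(a + kc, c)$, $(a + kd, d)$, $(b + kc, c)$ and $(b + kd, d)$. Using an elementary geometric argument, one can show that

\[ \operatorname{vol}(\mathcal{P}) = \operatorname{vol}(I). \]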


Combining Example 6.49, Example 6.50 and Example 6.51, we conclude that (6.8) holds when $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$ is an elementary transformation.
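The three examples can be checked numerically in the plane. The sketch below is only an illustration under assumed sample values (the rectangle $[1,4] \times [2,3]$ and $k = -2.5$ are arbitrary choices): it maps the corners of the rectangle by each elementary transformation and computes the area of the image polygon with the shoelace formula, which is a different technique from the partition argument used in the text.

```python
def shoelace_area(vertices):
    """Area of a simple polygon from its vertices listed in order."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def image_area(T, a, b, c, d):
    """Area of the image of the rectangle [a, b] x [c, d] under a linear map T."""
    corners = [(a, c), (b, c), (b, d), (a, d)]
    return shoelace_area([T(x, y) for x, y in corners])


k = -2.5                                  # assumed sample constant
swap = lambda x, y: (y, x)                # type I,   |det E| = 1
scale = lambda x, y: (x, k * y)           # type II,  |det E| = |k|
shear = lambda x, y: (x + k * y, y)       # type III, |det E| = 1

rect = (1.0, 4.0, 2.0, 3.0)               # a, b, c, d, so vol(I) = 3
ratios = [image_area(T, *rect) / 3.0 for T in (swap, scale, shear)]
print(ratios)                             # area ratios vol(T(I)) / vol(I)
```

Each ratio should match $|\det E|$ for the corresponding elementary matrix.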


The type II elementary transformations map rectangles to rectangles, and so do their compositions. Therefore, (6.8) also holds if the linear transformation $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$ is a composition of type II elementary transformations. This gives the following.


Theorem 6.62 


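Let $\mathbf{x}_0 = (u_1, \ldots, u_n)$ be a fixed point in $\mathbb{R}^n$, and let $\Psi : \mathbb{R}^n \to \mathbb{R}^n$ be the mapping

\[ \Psi_i(\mathbf{x}) = a_i x_i + u_i. \]

Equivalently, $\Psi(\mathbf{x}) = A\mathbf{x} + \mathbf{x}_0$, where $A$ is a diagonal matrix with diagonal entries $a_1, \ldots, a_n$. If $\mathfrak{D}$ is a Jordan measurable subset of $\mathbb{R}^n$, then $\Psi(\mathfrak{D})$ is Jordan measurable. If $f : \Psi(\mathfrak{D}) \to \mathbb{R}$ is a Riemann integrable function, then $h = (f \circ \Psi) : \mathfrak{D} \to \mathbb{R}$ is also Riemann integrable, and

\[ \int_{\Psi(\mathfrak{D})} f(\mathbf{x})\,d\mathbf{x} = |\det A| \int_{\mathfrak{D}} h(\mathbf{x})\,d\mathbf{x} = |\det A| \int_{\mathfrak{D}} f(A\mathbf{x} + \mathbf{x}_0)\,d\mathbf{x}. \tag{6.10} \]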


Figure 6.34: The linear transformation $\mathbf{T}(x, y) = (x + y, y)$.

Figure 6.35: The linear transformation $\mathbf{T}(x, y) = (x + 2y, y)$.

Figure 6.36: The linear transformation $\mathbf{T}(x, y) = (x - y, y)$.


Notice that $\det A = a_1 \cdots a_n$. If $\mathbf{y} = \Psi(\mathbf{x})$, then

\[ y_i = a_i x_i + u_i, \quad 1 \le i \le n. \]

The proof of Theorem 6.62 is similar to the proof of Theorem 6.60, by establishing a one-to-one correspondence between the partitions, and using the fact that for any rectangle $J$,

\[ \operatorname{vol}(\Psi(J)) = |a_1 \cdots a_n| \operatorname{vol}(J). \]




Example 6.52 


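Find the area of the ellipse

\[ E = \{(x, y) \mid 4(x + 1)^2 + 9(y - 5)^2 \le 49\}. \]

Solution
Make a change of variables $u = 2(x + 1)$ and $v = 3(y - 5)$. The Jacobian is

\[ \frac{\partial(u, v)}{\partial(x, y)} = \det \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} = 6. \]

Therefore,

\[ \operatorname{area}(E) = \int_{4(x+1)^2 + 9(y-5)^2 \le 49} dx\,dy = \int_{u^2 + v^2 \le 49} \left| \frac{\partial(x, y)}{\partial(u, v)} \right| du\,dv = \frac{1}{6} \int_{u^2 + v^2 \le 49} du\,dv = \frac{49}{6}\pi. \]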


Finally, let us consider an example of applying a general linear transformation. 


Example 6.53 


Evaluate the integral 


[ee d 
dr 
gy 2x — 3y +8 me 


\[ D = \{(x, y) \mid 2|x| + 3|y| \le 6\}. \]


Solution 


Notice that for any $(x, y) \in D$,

\[ |2x - 3y + 8| \ge 8 - 2|x| - 3|y| \ge 2. \]


Figure 6.37: The transformation $u = 2x - 3y$ and $v = 2x + 3y$.


Hence, the function 
20 30S 


es 2x —3y4+8 
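is continuous on $D$. The region $D$ is enclosed by the 4 lines $2x + 3y = 6$, $2x + 3y = -6$, $2x - 3y = 6$ and $2x - 3y = -6$. This prompts us to define a change of variables by $u = 2x - 3y$ and $v = 2x + 3y$. This is a linear transformation with Jacobian

\[ \frac{\partial(u, v)}{\partial(x, y)} = \det \begin{bmatrix} 2 & -3 \\ 2 & 3 \end{bmatrix} = 12. \]

Therefore,

\[ \frac{\partial(x, y)}{\partial(u, v)} = \frac{1}{12}. \]

The region $D$ in the $(x, y)$-plane is mapped to the rectangle

\[ R = \{(u, v) \mid -6 \le u \le 6,\ -6 \le v \le 6\} \]

in the $(u, v)$-plane.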




(x,y) 


Aaa dudv 


L 3 
PT tec 


6.5.2 Polar Coordinates 


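Given a point $(x, y)$ in the plane $\mathbb{R}^2$, if $r$ is a nonnegative number and $\theta$ is a real number such that

\[ x = r\cos\theta, \quad y = r\sin\theta, \]

then $(r, \theta)$ are called the polar coordinates of the point $(x, y)$. Notice that

\[ r = \sqrt{x^2 + y^2}. \]

Restricted to

\[ V = \{(r, \theta) \mid r > 0,\ 0 \le \theta < 2\pi\}, \]

the map $\Phi : V \to \mathbb{R}^2$,

\[ \Phi(r, \theta) = (r\cos\theta, r\sin\theta) \]

is one-to-one, and its range is $\mathbb{R}^2 \setminus \{(0, 0)\}$. However, the inverse of $\Phi$ fails to be continuous. We can extend the map $\Phi$ to $\mathbb{R}^2$ continuously. Namely, given $(r, \theta) \in \mathbb{R}^2$, let $(x, y) = \Phi(r, \theta)$, where

\[ x = r\cos\theta, \quad y = r\sin\theta. \]

Then $\Phi : \mathbb{R}^2 \to \mathbb{R}^2$ is continuously differentiable, but it fails to be one-to-one. Nevertheless, for any real number $\alpha$, the map is continuous and one-to-one on the open set

\[ \mathcal{O}_\alpha = \{(r, \theta) \mid r > 0,\ \alpha < \theta < \alpha + 2\pi\}. \]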




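The derivative matrix of $\Phi : \mathbb{R}^2 \to \mathbb{R}^2$ is given by

\[ D\Phi(r, \theta) = \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix}. \]

Since

\[ \det D\Phi(r, \theta) = r\cos^2\theta + r\sin^2\theta = r, \]

we find that for any $(r, \theta) \in \mathcal{O}_\alpha$, $D\Phi(r, \theta)$ is invertible. Hence, $\Phi : \mathcal{O}_\alpha \to \mathbb{R}^2$ is a smooth change of variables.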
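The identity $\det D\Phi(r, \theta) = r$ can be sanity-checked numerically. The following sketch approximates the partial derivatives by central differences; the helper name `jacobian_det_2d`, the step size `h` and the sample point are assumptions made for illustration, and the result is only an approximation, not a proof.

```python
import math


def polar(r, theta):
    """The map Phi(r, theta) = (r cos(theta), r sin(theta))."""
    return (r * math.cos(theta), r * math.sin(theta))


def jacobian_det_2d(F, p, q, h=1e-6):
    """Central-difference estimate of det DF at the point (p, q)."""
    d_p = [(a - b) / (2 * h) for a, b in zip(F(p + h, q), F(p - h, q))]
    d_q = [(a - b) / (2 * h) for a, b in zip(F(p, q + h), F(p, q - h))]
    # d_p and d_q approximate the two columns of the derivative matrix.
    return d_p[0] * d_q[1] - d_q[0] * d_p[1]


r, theta = 2.5, 1.1                       # assumed sample point with r > 0
estimate = jacobian_det_2d(polar, r, theta)
print(estimate, r)                        # the two numbers should nearly agree
```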


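Figure 6.38: The mapping $\Phi : \mathcal{O} \to \mathbb{R}^2$, $\Phi(r, \theta) = (r\cos\theta, r\sin\theta)$ maps $\mathcal{O} = \{(r, \theta) \mid r > 0,\ 0 < \theta < 2\pi\}$ to $\mathbb{R}^2 \setminus L$, where $L$ is the positive $x$-axis.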


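Let us consider the special case where $\alpha = 0$. In this case, let $\mathcal{O} = \mathcal{O}_0$. The map $\Phi : \mathcal{O} \to \mathbb{R}^2$,

\[ \Phi(r, \theta) = (r\cos\theta, r\sin\theta) \]

is a smooth change of variables from polar coordinates to rectangular coordinates. Under this change of variables,

\[ \Phi(\mathcal{O}) = \mathbb{R}^2 \setminus \{(x, 0) \mid x \ge 0\} \]

is an open set in $\mathbb{R}^2$. If $D$ is the open rectangle $(r_1, r_2) \times (\theta_1, \theta_2)$, with

\[ 0 < r_1 < r_2 \quad\text{and}\quad 0 < \theta_1 < \theta_2 < 2\pi, \]

then $\Phi(D)$ is the open set bounded between the two circles $x^2 + y^2 = r_1^2$ and $x^2 + y^2 = r_2^2$, and the two rays $y = x\tan\theta_1$, $x \ge 0$ and $y = x\tan\theta_2$, $x \ge 0$.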


Figure 6.39: The region $D = \{(r\cos\theta, r\sin\theta) \mid r_1 < r < r_2,\ \theta_1 < \theta < \theta_2\}$ in the $(x, y)$-plane, panels (a) and (b).

Figure 6.40: The region $D = \{(r\cos\theta, r\sin\theta) \mid r_1 < r < r_2,\ \theta_1 < \theta < \theta_2\}$ in the $(x, y)$-plane, panels (a) and (b).


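To apply the change of variables theorem, we notice that the Jacobian is

\[ \frac{\partial(x, y)}{\partial(r, \theta)} = r. \]

Thus,

\[ dx\,dy = \left| \frac{\partial(x, y)}{\partial(r, \theta)} \right| dr\,d\theta = r\,dr\,d\theta. \]

The change of variables theorem says the following.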




Theorem 6.63 


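Let $[\alpha, \beta]$ be a closed interval such that $\beta \le \alpha + 2\pi$. Assume that $g : [\alpha, \beta] \to \mathbb{R}$ and $h : [\alpha, \beta] \to \mathbb{R}$ are continuous functions satisfying

\[ 0 \le g(\theta) \le h(\theta) \quad\text{for all } \alpha \le \theta \le \beta. \]

Let $\mathfrak{D}$ be the region in the $(x, y)$-plane given by

\[ \mathfrak{D} = \{(r\cos\theta, r\sin\theta) \mid \alpha \le \theta \le \beta,\ g(\theta) \le r \le h(\theta)\}. \]

If $f : \mathfrak{D} \to \mathbb{R}$ is a continuous function, then

\[ \int_{\mathfrak{D}} f(x, y)\,dx\,dy = \int_\alpha^\beta \int_{g(\theta)}^{h(\theta)} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta. \tag{6.11} \]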


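Proof

Let

\[ \mathcal{U} = \{(r, \theta) \mid \alpha \le \theta \le \beta,\ g(\theta) \le r \le h(\theta)\}. \]

Then $\mathcal{U}$ is a compact Jordan measurable set in $\mathbb{R}^2$.
If $\beta < \alpha + 2\pi$, take any $\alpha_0$ such that

\[ \alpha_0 < \alpha < \beta < \alpha_0 + 2\pi. \]

For example, we can take

\[ \alpha_0 = \alpha - \frac{2\pi - (\beta - \alpha)}{2}. \]

If we also have

\[ g(\theta) > 0 \quad\text{for all } \theta \in [\alpha, \beta], \]

then $\mathcal{U}$ is contained in the set

\[ \mathcal{O}_{\alpha_0} = \{(r, \theta) \mid r > 0,\ \alpha_0 < \theta < \alpha_0 + 2\pi\}, \]

and $\Phi(\mathcal{U}) = \mathfrak{D}$. Applying the change of variables theorem to the mapping $\Phi : \mathcal{O}_{\alpha_0} \to \mathbb{R}^2$ gives the desired formula (6.11) immediately.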




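If $g(\theta) = 0$ for some $\theta \in [\alpha, \beta]$, then we consider the set

\[ \mathcal{U}_\varepsilon = \{(r, \theta) \mid \alpha \le \theta \le \beta,\ g(\theta) + \varepsilon \le r \le h(\theta) + \varepsilon\}, \quad \varepsilon > 0. \]

It is contained in $\mathcal{O}_{\alpha_0}$. Using boundedness of the continuous functions $g : [\alpha, \beta] \to \mathbb{R}$, $h : [\alpha, \beta] \to \mathbb{R}$ and $f : \mathfrak{D} \to \mathbb{R}$, it is easy to show that

\[ \int_{\mathfrak{D}} f(x, y)\,dx\,dy = \lim_{\varepsilon \to 0^+} \int_{\Phi(\mathcal{U}_\varepsilon)} f(x, y)\,dx\,dy. \]

By the change of variables formula, we have

\[ \int_{\Phi(\mathcal{U}_\varepsilon)} f(x, y)\,dx\,dy = \int_\alpha^\beta \int_{g(\theta) + \varepsilon}^{h(\theta) + \varepsilon} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta. \]

Taking the $\varepsilon \to 0^+$ limit yields again the desired formula (6.11).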

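The last case we have to consider is when $\beta = \alpha + 2\pi$. The technicality is that $\mathcal{U}$ is not contained in any of the sets $\mathcal{O}_{\alpha_0}$ on which the mapping $\Phi(r, \theta) = (r\cos\theta, r\sin\theta)$ is a smooth change of variables. Instead of taking limits, there is an alternative way to resolve the problem. We write $\mathcal{U} = \mathcal{U}_1 \cup \mathcal{U}_2$, where

\[ \mathcal{U}_1 = \{(r, \theta) \mid \alpha \le \theta \le \alpha + \pi,\ g(\theta) \le r \le h(\theta)\}, \]
\[ \mathcal{U}_2 = \{(r, \theta) \mid \alpha + \pi \le \theta \le \alpha + 2\pi,\ g(\theta) \le r \le h(\theta)\}. \]

We have shown that the change of variables formula (6.11) is valid for $\mathcal{U}_1$ and $\mathcal{U}_2$. Applying the additivity theorem, we find that

\[ \begin{aligned} \int_{\mathfrak{D}} f(x, y)\,dx\,dy &= \int_{\Phi(\mathcal{U}_1)} f(x, y)\,dx\,dy + \int_{\Phi(\mathcal{U}_2)} f(x, y)\,dx\,dy \\ &= \int_\alpha^{\alpha + \pi} \int_{g(\theta)}^{h(\theta)} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta + \int_{\alpha + \pi}^{\alpha + 2\pi} \int_{g(\theta)}^{h(\theta)} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta \\ &= \int_\alpha^\beta \int_{g(\theta)}^{h(\theta)} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta. \end{aligned} \]

Namely, the formula (6.11) is still valid when $\beta = \alpha + 2\pi$.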




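Let us give a geometric explanation for the Jacobian

\[ \frac{\partial(x, y)}{\partial(r, \theta)} = r, \quad\text{where } x = r\cos\theta,\ y = r\sin\theta. \]

Assume that

\[ \theta_1 < \theta_2 < \theta_1 + 2\pi \quad\text{and}\quad 0 < r_1 < r_2. \]

Let

\[ D = \{(r\cos\theta, r\sin\theta) \mid r_1 \le r \le r_2,\ \theta_1 \le \theta \le \theta_2\}. \]

The area bounded between the circles $x^2 + y^2 = r_1^2$ and $x^2 + y^2 = r_2^2$ is $\pi(r_2^2 - r_1^2)$. By rotational symmetry of the circle, the area of $D$ is

\[ \Delta A = \pi(r_2^2 - r_1^2) \times \frac{\theta_2 - \theta_1}{2\pi} = \bar{r}\,\Delta r\,\Delta\theta, \]

where

\[ \bar{r} = \frac{r_1 + r_2}{2}, \quad \Delta r = r_2 - r_1 \quad\text{and}\quad \Delta\theta = \theta_2 - \theta_1. \]

When $\Delta r \to 0$, then

\[ \Delta A \sim r\,\Delta r\,\Delta\theta, \]

where $r \sim r_1 \sim r_2$.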
Figure 6.41: The rectangle $[r_1, r_2] \times [\theta_1, \theta_2]$ in the $(r, \theta)$-plane is mapped to the region $\{(r\cos\theta, r\sin\theta) \mid r_1 \le r \le r_2,\ \theta_1 \le \theta \le \theta_2\}$ in the $(x, y)$-plane.
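The geometric explanation can be illustrated with a few numbers. In the sketch below, the exact area of the annular sector is compared with the product $r\,\Delta r\,\Delta\theta$, where $r$ is taken to be the inner radius $r_1$; the function name and the sample values are assumptions chosen for illustration.

```python
import math


def annular_sector_area(r1, r2, theta1, theta2):
    """Exact area of the region r1 <= r <= r2, theta1 <= theta <= theta2."""
    return math.pi * (r2**2 - r1**2) * (theta2 - theta1) / (2 * math.pi)


r, dtheta = 2.0, 0.3                      # assumed sample values
for dr in (0.5, 0.05, 0.005):
    exact = annular_sector_area(r, r + dr, 0.0, dtheta)
    approx = r * dr * dtheta              # Jacobian r times the area of the rectangle
    # Algebraically the ratio is 1 + dr / (2 r), so it tends to 1 as dr -> 0.
    print(dr, exact / approx)
```

As $\Delta r$ shrinks, the ratio approaches 1, which is the content of $\Delta A \sim r\,\Delta r\,\Delta\theta$.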


Now let us look at some examples. 




Example 6.54 


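Evaluate the integral $\displaystyle\int_D (2x^2 + y^2)\,dx\,dy$, where

\[ D = \{(x, y) \mid x \ge 0,\ y \ge 0,\ 4x^2 + 9y^2 \le 36\}. \]

Solution
Making a change of variables $x = 3u$, $y = 2v$, the Jacobian is

\[ \frac{\partial(x, y)}{\partial(u, v)} = \det \begin{bmatrix} 3 & 0 \\ 0 & 2 \end{bmatrix} = 6. \]

Therefore,

\[ \int_D (2x^2 + y^2)\,dx\,dy = 6 \int_{\mathcal{U}} (18u^2 + 4v^2)\,du\,dv, \]

where $\mathcal{U}$ is the region

\[ \mathcal{U} = \{(u, v) \mid u \ge 0,\ v \ge 0,\ u^2 + v^2 \le 1\}. \]

Since $\mathcal{U}$ is symmetric if we interchange $u$ and $v$, we find that

\[ \int_{\mathcal{U}} u^2\,du\,dv = \int_{\mathcal{U}} v^2\,du\,dv = \frac{1}{2} \int_{\mathcal{U}} (u^2 + v^2)\,du\,dv. \]

Using polar coordinates $u = r\cos\theta$, $v = r\sin\theta$, we find that

\[ \int_{\mathcal{U}} (u^2 + v^2)\,du\,dv = \int_0^{\pi/2} \int_0^1 r^2 \times r\,dr\,d\theta = \frac{\pi}{2} \int_0^1 r^3\,dr = \frac{\pi}{8}. \]

Therefore,

\[ \int_D (2x^2 + y^2)\,dx\,dy = 6 \times \frac{18 + 4}{2} \int_{\mathcal{U}} (u^2 + v^2)\,du\,dv = \frac{33\pi}{4}. \]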

Figure 6.42: The regions $D = \{(x, y) \mid x \ge 0,\ y \ge 0,\ 4x^2 + 9y^2 \le 36\}$ and $\mathcal{U} = \{(u, v) \mid u \ge 0,\ v \ge 0,\ u^2 + v^2 \le 1\}$.


Example 6.55 


Find the volume of the solid bounded between the surface $z = x^2 + y^2$ and the plane $z = 9$.

Solution
The solid $S$ bounded between the surface $z = x^2 + y^2$ and the plane $z = 9$ can be expressed as

\[ S = \{(x, y, z) \mid (x, y) \in D,\ x^2 + y^2 \le z \le 9\}, \]

where

\[ D = \{(x, y) \mid x^2 + y^2 \le 9\}. \]

Since $D$ is a closed ball, it is a Jordan measurable set. The volume of $S$ is the integral of the constant function $\chi_S : S \to \mathbb{R}$. It is a continuous function. By Fubini's theorem,

\[ \operatorname{vol}(S) = \int_D \left( \int_{x^2 + y^2}^{9} dz \right) dx\,dy = \int_D (9 - x^2 - y^2)\,dx\,dy. \]

Using polar coordinates, we have

\[ \operatorname{vol}(S) = \int_0^{2\pi} \int_0^3 (9 - r^2)\,r\,dr\,d\theta = 2\pi \left[ \frac{9r^2}{2} - \frac{r^4}{4} \right]_0^3 = \frac{81\pi}{2}. \]
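The right-hand side of formula (6.11) is an ordinary iterated integral, so it can be approximated directly. The sketch below uses a midpoint rule in both variables; the function name `polar_integral` and the grid size `n` are illustrative assumptions, and the midpoint rule is a generic numerical method rather than anything used in the text. It is applied to the integral of this example.

```python
import math


def polar_integral(f, alpha, beta, g, h, n=400):
    """Midpoint-rule estimate of the iterated integral in (6.11)."""
    total = 0.0
    dtheta = (beta - alpha) / n
    for i in range(n):
        theta = alpha + (i + 0.5) * dtheta
        lo, hi = g(theta), h(theta)
        dr = (hi - lo) / n
        for j in range(n):
            r = lo + (j + 0.5) * dr
            # The extra factor r is the Jacobian of the polar map.
            total += f(r * math.cos(theta), r * math.sin(theta)) * r * dr * dtheta
    return total


# Integrate 9 - x^2 - y^2 over the disc of radius 3 centred at the origin.
estimate = polar_integral(lambda x, y: 9 - x * x - y * y,
                          0.0, 2 * math.pi, lambda t: 0.0, lambda t: 3.0)
print(estimate, 81 * math.pi / 2)         # numerical estimate and closed form
```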


Figure 6.43: The solid bounded between the surface $z = x^2 + y^2$ and the plane $z = 9$.


Example 6.56 


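Let $a$ be a positive number. Find the volume of the ball

\[ B = \{(x, y, z) \mid x^2 + y^2 + z^2 \le a^2\}. \]

Solution
Let

\[ D = \{(x, y) \mid x^2 + y^2 \le a^2\}. \]

Then the ball $B$ can be described as

\[ B = \left\{(x, y, z) \mid (x, y) \in D,\ -\sqrt{a^2 - x^2 - y^2} \le z \le \sqrt{a^2 - x^2 - y^2}\right\}. \]

Thus, by Fubini's theorem, its volume is

\[ \operatorname{vol}(B) = \int_B dx\,dy\,dz = \int_D \left( \int_{-\sqrt{a^2 - x^2 - y^2}}^{\sqrt{a^2 - x^2 - y^2}} dz \right) dx\,dy = 2 \int_D \sqrt{a^2 - x^2 - y^2}\,dx\,dy. \]

Using polar coordinates,

\[ \operatorname{vol}(B) = 2 \int_0^{2\pi} \int_0^a \sqrt{a^2 - r^2}\,r\,dr\,d\theta = 4\pi \int_0^a r\sqrt{a^2 - r^2}\,dr. \]

Let $u = a^2 - r^2$. Then $du = -2r\,dr$. When $r = 0$, $u = a^2$. When $r = a$, $u = 0$. Therefore,

\[ \operatorname{vol}(B) = 2\pi \int_0^{a^2} u^{1/2}\,du = 2\pi \left[ \frac{2}{3} u^{3/2} \right]_0^{a^2} = \frac{4}{3}\pi a^3. \]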


Example 6.57 


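Let $a$ be a positive number, and let $\alpha$ be a number in the interval $\left(0, \frac{\pi}{2}\right)$. Find the volume of the solid $E$ bounded between the sphere

\[ S = \{(x, y, z) \mid x^2 + y^2 + z^2 = a^2\} \]

and the cone

\[ C = \left\{(x, y, z) \mid z = \cot\alpha \sqrt{x^2 + y^2}\right\}. \]

Solution
The surfaces $S$ and $C$ intersect at the points $(x, y, z)$ satisfying

\[ (x^2 + y^2)(1 + \cot^2\alpha) = a^2. \]

Namely,

\[ x^2 + y^2 = a^2 \sin^2\alpha. \]

Therefore,

\[ E = \left\{(x, y, z) \mid (x, y) \in D,\ \cot\alpha \sqrt{x^2 + y^2} \le z \le \sqrt{a^2 - x^2 - y^2}\right\}, \]

where

\[ D = \{(x, y) \mid x^2 + y^2 \le a^2 \sin^2\alpha\}. \]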



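Using Fubini's theorem and polar coordinates, we find that

\[ \begin{aligned} \operatorname{vol}(E) = \int_E dx\,dy\,dz &= \int_D \left( \int_{\cot\alpha\sqrt{x^2 + y^2}}^{\sqrt{a^2 - x^2 - y^2}} dz \right) dx\,dy \\ &= \int_0^{2\pi} \int_0^{a\sin\alpha} \left( \sqrt{a^2 - r^2} - r\cot\alpha \right) r\,dr\,d\theta. \end{aligned} \]

Using a change of variables $u = a^2 - r^2$, we find that

\[ \int_0^{a\sin\alpha} r\sqrt{a^2 - r^2}\,dr = \frac{1}{2} \int_{a^2\cos^2\alpha}^{a^2} u^{1/2}\,du = \frac{1}{3} \left[ u^{3/2} \right]_{a^2\cos^2\alpha}^{a^2} = \frac{a^3}{3}\left(1 - \cos^3\alpha\right). \]

On the other hand,

\[ \int_0^{a\sin\alpha} r^2 \cot\alpha\,dr = \cot\alpha \left[ \frac{r^3}{3} \right]_0^{a\sin\alpha} = \frac{a^3}{3} \frac{\cos\alpha}{\sin\alpha} \sin^3\alpha = \frac{a^3}{3}\left(\cos\alpha - \cos^3\alpha\right). \]

Therefore, the volume of $E$ is

\[ \operatorname{vol}(E) = \frac{2\pi a^3}{3}\left(1 - \cos\alpha\right). \]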


Figure 6.44: The solid bounded between a sphere and a cone. 




6.5.3 Spherical Coordinates


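Now we consider the spherical coordinates, which is an alternative coordinate system for $\mathbb{R}^3$. Consider the mapping $\Psi : \mathbb{R}^3 \to \mathbb{R}^3$ given by

\[ \Psi(\rho, \phi, \theta) = (x, y, z) = (\rho\sin\phi\cos\theta, \rho\sin\phi\sin\theta, \rho\cos\phi). \]

Namely,

\[ x = \rho\sin\phi\cos\theta, \quad y = \rho\sin\phi\sin\theta, \quad z = \rho\cos\phi. \]

Let $V$ be the set

\[ V = \{(\rho, \phi, \theta) \mid \rho > 0,\ 0 \le \phi \le \pi,\ 0 \le \theta < 2\pi\}. \]

Given $\mathbf{u} = (x, y, z) \in \mathbb{R}^3 \setminus \{(0, 0, 0)\}$, we claim that there is a unique $(\rho, \phi, \theta) \in V$ such that $\Psi(\rho, \phi, \theta) = (x, y, z)$. This triple $(\rho, \phi, \theta)$ is called the spherical coordinates of the point $\mathbf{u} = (x, y, z)$. It is easy to see that

\[ \rho = \sqrt{x^2 + y^2 + z^2} = \|\mathbf{u}\| \]

is the distance from the point $\mathbf{u} = (x, y, z)$ to the origin. If we let

\[ \phi = \cos^{-1} \frac{z}{\rho}, \]

then $\phi$ satisfies $0 \le \phi \le \pi$, and

\[ \langle \mathbf{u}, \mathbf{e}_3 \rangle = z = \rho\cos\phi = \|\mathbf{u}\| \cos\phi. \]

Thus, geometrically, $\phi$ is the angle the vector from $\mathbf{0}$ to $\mathbf{u}$ makes with the positive $z$-axis. Let $W$ be the $(x, y)$-plane in $\mathbb{R}^3$. Then

\[ (x, y, 0) = \operatorname{proj}_W \mathbf{u}. \]

Let

\[ r = \rho\sin\phi. \]

Then

\[ r = \sqrt{\rho^2 - \rho^2\cos^2\phi} = \sqrt{\rho^2 - z^2} = \sqrt{x^2 + y^2}. \]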


Figure 6.45: Spherical Coordinates.


Thus, $\theta \in [0, 2\pi)$ is uniquely determined so that

\[ x = r\cos\theta, \quad y = r\sin\theta. \]

Equivalently, $(r, \theta)$ are the polar coordinates of the point $(x, y)$ in $\mathbb{R}^2 \setminus \{(0, 0)\}$. Hence, we find that the map $\Psi : V \to \mathbb{R}^3$,

\[ \Psi(\rho, \phi, \theta) = (x, y, z) = (\rho\sin\phi\cos\theta, \rho\sin\phi\sin\theta, \rho\cos\phi) \]

is one-to-one on the set $V$, and the range is $\mathbb{R}^3 \setminus \{(0, 0, 0)\}$. However, the inverse map is not continuous.


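Let us calculate the derivative matrix of $\Psi : \mathbb{R}^3 \to \mathbb{R}^3$. We find that

\[ D\Psi(\rho, \phi, \theta) = \begin{bmatrix} \sin\phi\cos\theta & \rho\cos\phi\cos\theta & -\rho\sin\phi\sin\theta \\ \sin\phi\sin\theta & \rho\cos\phi\sin\theta & \rho\sin\phi\cos\theta \\ \cos\phi & -\rho\sin\phi & 0 \end{bmatrix}. \]

Therefore, the Jacobian $\det D\Psi(\rho, \phi, \theta)$ is

\[ \begin{aligned} \frac{\partial(x, y, z)}{\partial(\rho, \phi, \theta)} &= \cos\phi \times \det \begin{bmatrix} \rho\cos\phi\cos\theta & -\rho\sin\phi\sin\theta \\ \rho\cos\phi\sin\theta & \rho\sin\phi\cos\theta \end{bmatrix} + \rho\sin\phi \times \det \begin{bmatrix} \sin\phi\cos\theta & -\rho\sin\phi\sin\theta \\ \sin\phi\sin\theta & \rho\sin\phi\cos\theta \end{bmatrix} \\ &= \rho^2\cos^2\phi\sin\phi + \rho^2\sin^3\phi = \rho^2\sin\phi. \end{aligned} \]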
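The value $\rho^2\sin\phi$ of the Jacobian can be cross-checked numerically. The sketch below builds the derivative matrix by central differences and evaluates its determinant by the $3 \times 3$ cofactor formula; the helper name `jacobian_det_3d`, the step `h` and the sample point are assumptions for illustration, and the output is an approximation only.

```python
import math


def spherical(rho, phi, theta):
    """Psi(rho, phi, theta) = (x, y, z) in spherical coordinates."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))


def jacobian_det_3d(F, point, h=1e-6):
    """Central-difference estimate of det DF at a point of R^3."""
    cols = []
    for k in range(3):
        plus, minus = list(point), list(point)
        plus[k] += h
        minus[k] -= h
        cols.append([(a - b) / (2 * h) for a, b in zip(F(*plus), F(*minus))])
    # m[i][k] approximates the partial derivative of F_i in the k-th variable.
    m = [[cols[k][i] for k in range(3)] for i in range(3)]
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


rho, phi, theta = 2.0, 0.7, 1.9           # assumed sample point
estimate = jacobian_det_3d(spherical, (rho, phi, theta))
print(estimate, rho**2 * math.sin(phi))   # the two numbers should nearly agree
```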




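This shows that $D\Psi(\rho, \phi, \theta)$ is invertible if and only if $\rho \neq 0$ and $\sin\phi \neq 0$, if and only if $(x, y, z)$ does not lie on the $z$-axis. Thus, for any real number $\alpha$, if $\mathcal{O}_\alpha$ is the open set

\[ \mathcal{O}_\alpha = \{(\rho, \phi, \theta) \mid \rho > 0,\ 0 < \phi < \pi,\ \alpha < \theta < \alpha + 2\pi\}, \]

then $\Psi : \mathcal{O}_\alpha \to \mathbb{R}^3$ is a smooth change of variables.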


The change of variables theorem gives the following. 


Theorem 6.64 


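Let $[\alpha, \beta]$ and $[\delta, \eta]$ be closed intervals such that

\[ \beta \le \alpha + 2\pi \quad\text{and}\quad 0 \le \delta < \eta \le \pi, \]

and let $I = [\delta, \eta] \times [\alpha, \beta]$. Assume that the functions $g : I \to \mathbb{R}$ and $h : I \to \mathbb{R}$ satisfy $0 \le g(\phi, \theta) \le h(\phi, \theta)$ for all $(\phi, \theta) \in I$, and let $\mathfrak{D}$ be the region in $\mathbb{R}^3$ defined by

\[ \mathfrak{D} = \{(\rho\sin\phi\cos\theta, \rho\sin\phi\sin\theta, \rho\cos\phi) \mid (\phi, \theta) \in I,\ g(\phi, \theta) \le \rho \le h(\phi, \theta)\}. \]

If $f : \mathfrak{D} \to \mathbb{R}$ is a continuous function, then

\[ \int_{\mathfrak{D}} f(x, y, z)\,dx\,dy\,dz = \int_\alpha^\beta \int_\delta^\eta \int_{g(\phi, \theta)}^{h(\phi, \theta)} f(\rho\sin\phi\cos\theta, \rho\sin\phi\sin\theta, \rho\cos\phi)\,\rho^2\sin\phi\,d\rho\,d\phi\,d\theta. \]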


Again, if $\beta < \alpha + 2\pi$, $\delta > 0$, $\eta < \pi$ and $g(\phi, \theta) > 0$ for all $(\phi, \theta) \in I$, this is just a direct consequence of the general change of variables theorem. The rest can be argued by taking limits.

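The result of Example 6.57 can be used to give some insight into the Jacobian

\[ \frac{\partial(x, y, z)}{\partial(\rho, \phi, \theta)} = \rho^2\sin\phi \]

that appears in the change from spherical coordinates to rectangular coordinates.

Consider the rectangle $I = [\rho_1, \rho_2] \times [\phi_1, \phi_2] \times [\theta_1, \theta_2]$ in the $(\rho, \phi, \theta)$ space, where $\theta_2 < \theta_1 + 2\pi$, and for simplicity, assume that $0 < \phi_1 < \phi_2 < \frac{\pi}{2}$. Under the mapping

\[ \Psi(\rho, \phi, \theta) = (\rho\sin\phi\cos\theta, \rho\sin\phi\sin\theta, \rho\cos\phi), \]

$\Psi(I)$ is a wedge in the solid $E$ in $\mathbb{R}^3$ bounded between the spheres $x^2 + y^2 + z^2 = \rho_1^2$, $x^2 + y^2 + z^2 = \rho_2^2$, and the cones $z = \cot\phi_1\sqrt{x^2 + y^2}$, $z = \cot\phi_2\sqrt{x^2 + y^2}$.

Since $E$ has a rotational symmetry with respect to $\theta$,

\[ \Delta V = \operatorname{vol}(\Psi(I)) = \frac{\theta_2 - \theta_1}{2\pi} \operatorname{vol}(E). \]

Using the inclusion and exclusion principle, the result of Example 6.57 gives

\[ \begin{aligned} \operatorname{vol}(E) &= \frac{2\pi}{3}\rho_2^3(1 - \cos\phi_2) - \frac{2\pi}{3}\rho_1^3(1 - \cos\phi_2) - \frac{2\pi}{3}\rho_2^3(1 - \cos\phi_1) + \frac{2\pi}{3}\rho_1^3(1 - \cos\phi_1) \\ &= \frac{2\pi}{3}(\rho_2^3 - \rho_1^3)(\cos\phi_1 - \cos\phi_2). \end{aligned} \]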

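Figure 6.46: Volume change under spherical coordinates.

By the mean value theorem, there is a $\rho \in (\rho_1, \rho_2)$ and a $\phi \in (\phi_1, \phi_2)$ such that

\[ \rho_2^3 - \rho_1^3 = 3\rho^2\Delta\rho \quad\text{and}\quad \cos\phi_1 - \cos\phi_2 = \sin\phi\,\Delta\phi, \]

where

\[ \Delta\rho = \rho_2 - \rho_1, \quad \Delta\phi = \phi_2 - \phi_1. \]

Let $\Delta\theta = \theta_2 - \theta_1$. Then we find that

\[ \Delta V = \rho^2\sin\phi\,\Delta\rho\,\Delta\phi\,\Delta\theta. \]

This gives an interpretation of the Jacobian $\rho^2\sin\phi$.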
Let us look at an example of applying spherical coordinates. 


Example 6.58 


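Compute the integral $\displaystyle\int_E (x^2 + 4z)\,dx\,dy\,dz$, where

\[ E = \{(x, y, z) \mid x^2 + y^2 + z^2 \le 9,\ z \ge 0\}. \]

Solution
Let $B$ be the ball

\[ B = \{(x, y, z) \mid x^2 + y^2 + z^2 \le 9\}. \]

By symmetry,

\[ \int_E x^2\,dx\,dy\,dz = \frac{1}{2} \int_B x^2\,dx\,dy\,dz = \frac{1}{6} \int_B (x^2 + y^2 + z^2)\,dx\,dy\,dz. \]

Using spherical coordinates, we have

\[ \int_E x^2\,dx\,dy\,dz = \frac{1}{6} \int_0^{2\pi} \int_0^\pi \int_0^3 \rho^2 \times \rho^2\sin\phi\,d\rho\,d\phi\,d\theta = \frac{\pi}{3} \Big[ -\cos\phi \Big]_0^\pi \left[ \frac{\rho^5}{5} \right]_0^3 = \frac{162\pi}{5}. \]

On the other hand,

\[ \int_E z\,dx\,dy\,dz = \int_0^{2\pi} \int_0^{\pi/2} \int_0^3 \rho\cos\phi \times \rho^2\sin\phi\,d\rho\,d\phi\,d\theta = 2\pi \left[ -\frac{\cos^2\phi}{2} \right]_0^{\pi/2} \left[ \frac{\rho^4}{4} \right]_0^3 = \frac{81\pi}{4}. \]

Therefore,

\[ \int_E (x^2 + 4z)\,dx\,dy\,dz = \frac{162\pi}{5} + 81\pi = \frac{567\pi}{5}. \]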
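The spherical-coordinates formula can also be evaluated numerically as a triple iterated integral. The following sketch is a rough midpoint-rule check of the value obtained in Example 6.58; the function name `spherical_integral`, the restriction to constant limits and the grid size `n` are illustrative assumptions.

```python
import math


def spherical_integral(f, rho_max, phi_max, theta_max, n=40):
    """Midpoint-rule estimate of the integral of f over the region
    0 <= rho <= rho_max, 0 <= phi <= phi_max, 0 <= theta <= theta_max."""
    d_rho, d_phi, d_theta = rho_max / n, phi_max / n, theta_max / n
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * d_rho
        for j in range(n):
            phi = (j + 0.5) * d_phi
            # Jacobian rho^2 sin(phi) times the volume of one grid cell.
            weight = rho**2 * math.sin(phi) * d_rho * d_phi * d_theta
            for k in range(n):
                theta = (k + 0.5) * d_theta
                x = rho * math.sin(phi) * math.cos(theta)
                y = rho * math.sin(phi) * math.sin(theta)
                z = rho * math.cos(phi)
                total += f(x, y, z) * weight
    return total


# Upper half of the ball of radius 3, integrand x^2 + 4z.
estimate = spherical_integral(lambda x, y, z: x * x + 4 * z,
                              3.0, math.pi / 2, 2 * math.pi)
print(estimate, 567 * math.pi / 5)        # numerical estimate and closed form
```

With a coarse grid the two printed numbers agree only to a few digits; increasing `n` tightens the agreement at the cost of running time.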


In the example above, we have used the symmetry of the region $E$ to avoid some complicated computations. Another example is the following.


Example 6.59 


Let a be a positive number. Evaluate the integral i «dadydz, where 
E 


\[ E = \{(x, y, z) \mid x \ge 0,\ y \ge 0,\ z \ge 0,\ x^2 + y^2 + z^2 \le a^2\}. \]


Solution 
The expression of the z variable in terms of the spherical coordinates is 
considerably simpler than the x and y variables. By symmetry, we have 


‘ xdxdydz = | z*dadydz. 
E 


E 


Thus, using spherical coordinates, we find that 


[ stavayae =/- ii p' cos’ x p* sin ¢ dpdodé 
BE 0 Jo Jo 


__ cos? 6g]? 
9 Jo 


6.5.4 Other Examples 
Example 6.60 
Let $D$ be the region

\[ D = \{(x, y) \mid y > 0,\ 4 \le x^2 - y^2 \le 9,\ 3 \le xy \le 7\}. \]

Compute the integral

\[ \int_D (x^3 y + x y^3)\,dx\,dy. \]
D 




Solution 
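Let $\mathcal{O} = \{(x, y) \mid y > 0\}$, and let $\Psi : \mathcal{O} \to \mathbb{R}^2$ be the mapping

\[ \Psi(x, y) = (x^2 - y^2, xy). \]

If $(x_1, y_1)$ and $(x_2, y_2)$ are points in $\mathcal{O}$ such that $\Psi(x_1, y_1) = \Psi(x_2, y_2)$, then

\[ x_1^2 - y_1^2 = x_2^2 - y_2^2 \quad\text{and}\quad x_1 y_1 = x_2 y_2. \]

Let $z_1 = x_1 + i y_1$ and $z_2 = x_2 + i y_2$. Then

\[ z_1^2 = (x_1 + i y_1)^2 = (x_2 + i y_2)^2 = z_2^2. \]

This implies that $z_2 = \pm z_1$. Thus, $y_2 = \pm y_1$. Since $y_1$ and $y_2$ are positive, we find that $y_1 = y_2$. Since $x_1 y_1 = x_2 y_2$, we then deduce that $x_1 = x_2$. Hence, $\Psi : \mathcal{O} \to \mathbb{R}^2$ is one-to-one. Since it is a polynomial mapping, it is continuously differentiable. Since

\[ D\Psi(x, y) = \begin{bmatrix} 2x & -2y \\ y & x \end{bmatrix}, \quad \det D\Psi(x, y) = 2(x^2 + y^2), \]

we find that $\det D\Psi(x, y) \neq 0$ for all $(x, y) \in \mathcal{O}$. This implies that $\Psi : \mathcal{O} \to \mathbb{R}^2$ is a smooth change of variables. Let $u = x^2 - y^2$, $v = xy$. The Jacobian is

\[ \frac{\partial(u, v)}{\partial(x, y)} = 2(x^2 + y^2). \]

Notice that

\[ \Psi(D) = \{(u, v) \mid 4 \le u \le 9,\ 3 \le v \le 7\}. \]

Therefore,

\[ \int_D xy(x^2 + y^2)\,dx\,dy = \frac{1}{2} \int_D v \left| \frac{\partial(u, v)}{\partial(x, y)} \right| dx\,dy = \frac{1}{2} \int_{\Psi(D)} v\,du\,dv = \frac{1}{2} \int_4^9 \int_3^7 v\,dv\,du = 50. \]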


Figure 6.47: The region $D = \{(x, y) \mid y > 0,\ 4 \le x^2 - y^2 \le 9,\ 3 \le xy \le 7\}$.


Remark 6.11 Hyperspherical Coordinates 


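For any $n \ge 4$, the hyperspherical coordinates in $\mathbb{R}^n$ are the coordinates $(r, \theta_1, \ldots, \theta_{n-1})$ such that

\[ \begin{aligned} x_1 &= r\cos\theta_1, \\ x_2 &= r\sin\theta_1\cos\theta_2, \\ x_3 &= r\sin\theta_1\sin\theta_2\cos\theta_3, \\ &\;\;\vdots \\ x_{n-1} &= r\sin\theta_1 \cdots \sin\theta_{n-2}\cos\theta_{n-1}, \\ x_n &= r\sin\theta_1 \cdots \sin\theta_{n-2}\sin\theta_{n-1}. \end{aligned} \tag{6.12} \]

If

\[ V = (0, \infty) \times [0, \pi]^{n-2} \times [0, 2\pi), \]

there is a one-to-one correspondence between $(r, \theta_1, \ldots, \theta_{n-1})$ in $V$ and $(x_1, \ldots, x_n)$ in $\mathbb{R}^n \setminus \{\mathbf{0}\}$ given by (6.12). One can show that the Jacobian of this transformation is

\[ \frac{\partial(x_1, x_2, \ldots, x_n)}{\partial(r, \theta_1, \ldots, \theta_{n-1})} = r^{n-1}\sin^{n-2}\theta_1 \cdots \sin\theta_{n-2}. \]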




Exercises 6.5 


Question 1 


Let

\[ D = \{(x, y) \mid 4(x + 1)^2 + 9(y - 2)^2 \le 144\}. \]

Evaluate the integral $\displaystyle\int_D (2x + 3y)\,dx\,dy$.


Question 2 


Evaluate the integral 
aU 
—_————— drdy, 
i (x — 2y + 8)? i 


Day) |e. 2 |u| = oe 
Question 3 


Evaluate the integral $\displaystyle\int_D (x^2 - xy + y^2)\,dx\,dy$, where


D={@,y)\\2 2 02> oy? < 36h 


Question 4 

Find the volume of the solid bounded between the surface $z = x^2 + y^2$ and the surface $x^2 + y^2 + z^2 = 20$.

Question 5 


Let a, b and c be positive numbers, and let EF be the solid 


2 y? ie 


Evaluate i} x*dxdydz. 
E 




Question 6 


Let a, b and c be positive numbers, and let EF’ be the solid 


a2 y? a 


Evaluate | x°dxdydz. 
E 


Question 7 


Let D be the region 


a il eo ye 0) 


4 4 
Ob — 
| Y dady. 
ma «rY 


Compute the integral 


Question 8 


Let $D$ be the region

\[ D = \{(x, y) \mid 5x^2 - 2xy + 10y^2 \le 9\}, \]

and let $\Psi : \mathbb{R}^2 \to \mathbb{R}^2$ be the mapping defined by

\[ \Psi(x, y) = (2x + y, x - 3y). \]

(a) Explain why $\Psi : \mathbb{R}^2 \to \mathbb{R}^2$ is a smooth change of variables.

(b) Find $\Psi(D)$.


8 
(c) Compute the integral [ 5a? — Dey + 10y? + 16 dady. 




6.6 Proof of the Change of Variables Theorem 


In this section, we give a complete proof of the change of variables theorem, which 
we restate here. 


Theorem 6.65 The Change of Variables Theorem 


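Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$, and let $\Psi : \mathcal{O} \to \mathbb{R}^n$ be a smooth change of variables. If $\mathfrak{D}$ is a Jordan measurable set such that its closure $\overline{\mathfrak{D}}$ is contained in $\mathcal{O}$, then $\Psi(\mathfrak{D})$ is Jordan measurable, and for any function $f : \Psi(\mathfrak{D}) \to \mathbb{R}$ that is bounded and continuous, we have

\[ \int_{\Psi(\mathfrak{D})} f(\mathbf{x})\,d\mathbf{x} = \int_{\mathfrak{D}} f(\Psi(\mathbf{x})) \left| \det D\Psi(\mathbf{x}) \right| d\mathbf{x}. \]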
W(2) D 


Among the assertions in the theorem, we will first establish the following. 


Theorem 6.66 


Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$, and let $\Psi : \mathcal{O} \to \mathbb{R}^n$ be a smooth change of variables. If $\mathfrak{D}$ is a Jordan measurable set such that its closure $\overline{\mathfrak{D}}$ is contained in $\mathcal{O}$, then $\Psi(\mathfrak{D})$ is also Jordan measurable.

A special case of the change of variables theorem is when $f : \Psi(\mathfrak{D}) \to \mathbb{R}$ is the characteristic function of $\Psi(\mathfrak{D})$. This gives the change of volume theorem.


Theorem 6.67 The Change of Volume Theorem 


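Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$, and let $\Psi : \mathcal{O} \to \mathbb{R}^n$ be a smooth change of variables. If $\mathfrak{D}$ is a Jordan measurable set such that its closure $\overline{\mathfrak{D}}$ is contained in $\mathcal{O}$, then

\[ \operatorname{vol}(\Psi(\mathfrak{D})) = \int_{\mathfrak{D}} \left| \det D\Psi(\mathbf{x}) \right| d\mathbf{x}. \]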


In the following, let us give some remarks about the statements in the theorem, 
and outline the plan of the proof. 




The Change of Variables Theorem 


1. The first step is to prove Theorem 6.66, which asserts that $\Psi(\mathfrak{D})$ is Jordan measurable. To do this, we first show that a smooth change of variables sets up a one-to-one correspondence between the open sets in the domain and the range. This basically follows from the inverse function theorem.

2. Since $f : \Psi(\mathfrak{D}) \to \mathbb{R}$ is continuous and bounded, if $\Psi(\mathfrak{D})$ is Jordan measurable, $f : \Psi(\mathfrak{D}) \to \mathbb{R}$ is Riemann integrable.

3. Let $g : \mathcal{O} \to \mathbb{R}$ be the function

\[ g(\mathbf{x}) = |\det D\Psi(\mathbf{x})|. \]

Since $\Psi : \mathcal{O} \to \mathbb{R}^n$ is continuously differentiable, the function $D\Psi : \mathcal{O} \to \mathbb{R}^{n^2}$ is continuous. Since determinant and absolute value are continuous functions, $g : \mathcal{O} \to \mathbb{R}$ is a continuous function. Since $\overline{\mathfrak{D}}$ is a compact set contained in $\mathcal{O}$, $g : \overline{\mathfrak{D}} \to \mathbb{R}$ is bounded.

4. Since $\Psi : \mathcal{O} \to \mathbb{R}^n$ is continuous, and the functions $f : \Psi(\mathfrak{D}) \to \mathbb{R}$ and $g : \mathfrak{D} \to \mathbb{R}$ are continuous and bounded, the function $h : \mathfrak{D} \to \mathbb{R}$,

\[ h(\mathbf{x}) = f(\Psi(\mathbf{x})) \left| \det D\Psi(\mathbf{x}) \right| \]

is continuous and bounded. Hence, it is Riemann integrable.

5. To prove the change of variables theorem, we will first prove the change of volume theorem. This is the most technical part of the proof.

6. To prove the change of volume theorem, we first consider the case where $\Psi : \mathbb{R}^n \to \mathbb{R}^n$ is an invertible linear transformation. In this case, the theorem says that if $\mathfrak{D}$ is a Jordan measurable set, and $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$, $\mathbf{T}(\mathbf{x}) = A\mathbf{x}$ is an invertible linear transformation, then

\[ \operatorname{vol}(\mathbf{T}(\mathfrak{D})) = |\det A| \operatorname{vol}(\mathfrak{D}). \tag{6.13} \]

7. To prove (6.13), we first consider the case where $\mathfrak{D} = \mathbf{I}$ is a closed rectangle. This is an easy consequence of the fact that the volume of a parallelepiped spanned by the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$ is equal to $|\det A|$, where $A$ is the matrix with $\mathbf{v}_1, \ldots, \mathbf{v}_n$ as column vectors. This is proved in Appendix B.

8. After proving the change of volume theorem, we will prove the change of variables theorem for the special case where $\mathfrak{D} = \mathbf{I}$ is a closed rectangle first. The general theorem then follows by some simple analysis arguments.


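We begin with the following proposition, which says that a smooth change of variables maps open sets to open sets.

Proposition 6.68

Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$, and let $\Psi : \mathcal{O} \to \mathbb{R}^n$ be a smooth change of variables. Then for any open set $\mathfrak{D}$ that is contained in $\mathcal{O}$, $\Psi(\mathfrak{D})$ is open in $\mathbb{R}^n$. In particular, $\Psi(\mathcal{O})$ is an open subset of $\mathbb{R}^n$.

Proof
Given that $\mathfrak{D}$ is an open subset of $\mathbb{R}^n$, let $\mathcal{W} = \Psi(\mathfrak{D})$. We want to show that $\mathcal{W}$ is an open set. If $\mathbf{y}_0$ is a point in $\mathcal{W}$, there is an $\mathbf{x}_0$ in $\mathfrak{D}$ such that $\mathbf{y}_0 = \Psi(\mathbf{x}_0)$. Since $\Psi : \mathcal{O} \to \mathbb{R}^n$ is continuously differentiable and $D\Psi(\mathbf{x}_0)$ is invertible, we can apply the inverse function theorem to conclude that there is an open set $\mathcal{U}_0$ containing $\mathbf{x}_0$ such that $\Psi(\mathcal{U}_0)$ is also open, and $\Psi^{-1} : \Psi(\mathcal{U}_0) \to \mathcal{U}_0$ is continuously differentiable. Let $\mathcal{U} = \mathcal{U}_0 \cap \mathfrak{D}$. Then $\mathcal{U}$ is an open subset of $\mathfrak{D}$ and $\mathcal{U}_0$. It follows that $\mathcal{V} = \Psi(\mathcal{U}) = (\Psi^{-1})^{-1}(\mathcal{U})$ is an open subset of $\mathbb{R}^n$ that is contained in $\mathcal{W} = \Psi(\mathfrak{D})$. Notice that $\mathcal{V}$ is an open set that contains $\mathbf{y}_0$. Thus, we have shown that every point in $\mathcal{W}$ has a neighbourhood that lies in $\mathcal{W}$. This proves that $\mathcal{W}$ is an open set.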


The following proposition says that the inverse of a smooth change of variables 
is also a smooth change of variables. 




Proposition 6.69 


Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$, and let $\Psi : \mathcal{O} \to \mathbb{R}^n$ be a smooth change of variables. Then $\Psi^{-1} : \Psi(\mathcal{O}) \to \mathbb{R}^n$ is also a smooth change of variables.

Proof
By Proposition 6.68, $\Psi(\mathcal{O})$ is an open set. By default, $\Psi^{-1} : \Psi(\mathcal{O}) \to \mathbb{R}^n$ is one-to-one. As in the proof of Proposition 6.68, the inverse function theorem implies that it is continuously differentiable. If $\mathbf{x}_0 = \Psi^{-1}(\mathbf{y}_0)$, the inverse function theorem says that

\[ D\Psi^{-1}(\mathbf{y}_0) = D\Psi(\mathbf{x}_0)^{-1}. \]

The inverse of an invertible matrix is invertible. Hence, for any $\mathbf{y}_0$ in $\Psi(\mathcal{O})$, $D\Psi^{-1}(\mathbf{y}_0)$ is invertible. These prove that $\Psi^{-1} : \Psi(\mathcal{O}) \to \mathbb{R}^n$ is a smooth change of variables.


Remark 6.12 Homeomorphisms and Diffeomorphisms 


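Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$, and let $\Psi : \mathcal{O} \to \mathbb{R}^n$ be a continuous injective map such that $\Psi(\mathcal{O})$ is open, and the inverse map $\Psi^{-1} : \Psi(\mathcal{O}) \to \mathcal{O}$ is continuous. Then we say that $\Psi : \mathcal{O} \to \Psi(\mathcal{O})$ is a homeomorphism. A homeomorphism sets up a one-to-one correspondence between open sets in $\mathcal{O}$ and open sets in $\Psi(\mathcal{O})$.

If $\Psi : \mathcal{O} \to \Psi(\mathcal{O})$ is a homeomorphism and both the maps $\Psi : \mathcal{O} \to \Psi(\mathcal{O})$ and $\Psi^{-1} : \Psi(\mathcal{O}) \to \mathcal{O}$ are continuously differentiable, then we say that $\Psi : \mathcal{O} \to \Psi(\mathcal{O})$ is a diffeomorphism. Proposition 6.68 and Proposition 6.69 imply that a smooth change of variables is a diffeomorphism.

A map of the form $\Psi : \mathbb{R}^n \to \mathbb{R}^n$,

\[ \Psi(\mathbf{x}) = \mathbf{y}_0 + A(\mathbf{x} - \mathbf{x}_0), \]

where $\mathbf{x}_0$ and $\mathbf{y}_0$ are points in $\mathbb{R}^n$ and $A$ is an invertible matrix, is a diffeomorphism.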


Now we can prove the following, which is essential for the proof of Theorem 6.66.


Theorem 6.70 


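Assume that $\mathcal{O}$ and $\mathcal{U}$ are open subsets of $\mathbb{R}^n$, and $\Psi : \mathcal{O} \to \mathcal{U}$ is a homeomorphism. If $\mathfrak{D}$ is a subset of $\mathcal{O}$ such that $\overline{\mathfrak{D}}$ is also contained in $\mathcal{O}$, then

\[ \operatorname{int} \Psi(\mathfrak{D}) = \Psi(\operatorname{int} \mathfrak{D}), \qquad \overline{\Psi(\mathfrak{D})} = \Psi(\overline{\mathfrak{D}}). \]

Thus,

\[ \partial\Psi(\mathfrak{D}) = \Psi(\partial\mathfrak{D}). \]

Proof
The interior of a set $A$ is an open set that contains every open set contained in $A$. By Remark 6.12, there is a one-to-one correspondence between the open sets that are contained in $\mathfrak{D}$ and the open sets that are contained in $\Psi(\mathfrak{D})$. Therefore,

\[ \operatorname{int} \Psi(\mathfrak{D}) = \Psi(\operatorname{int} \mathfrak{D}). \]

Since $\overline{\mathfrak{D}}$ is a compact set and $\Psi : \mathcal{O} \to \mathcal{U}$ is continuous, $\Psi(\overline{\mathfrak{D}})$ is a compact set. Therefore, $\Psi(\overline{\mathfrak{D}})$ is a closed set that contains $\Psi(\mathfrak{D})$. This implies that

\[ \overline{\Psi(\mathfrak{D})} \subset \Psi(\overline{\mathfrak{D}}). \tag{6.14} \]

Since $\Psi^{-1} : \mathcal{U} \to \mathcal{O}$ is also continuous, the same argument gives

\[ \overline{\mathfrak{D}} = \overline{\Psi^{-1}(\Psi(\mathfrak{D}))} \subset \Psi^{-1}\left(\overline{\Psi(\mathfrak{D})}\right). \]

This implies that

\[ \Psi(\overline{\mathfrak{D}}) \subset \overline{\Psi(\mathfrak{D})}. \tag{6.15} \]

Eq. (6.14) and (6.15) give

\[ \overline{\Psi(\mathfrak{D})} = \Psi(\overline{\mathfrak{D}}). \]

The last assertion follows from the fact that for any set $A$, $\overline{A}$ is a disjoint union of $\operatorname{int} A$ and $\partial A$.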


Recall that a set $\mathfrak{D}$ in $\mathbb{R}^n$ has Jordan content zero if and only if for every $\varepsilon > 0$, $\mathfrak{D}$ can be covered by finitely many cubes $Q_1, \ldots, Q_k$, such that

\[ \sum_{j=1}^{k} \operatorname{vol}(Q_j) < \varepsilon. \]


The next proposition gives a control of the size of the cube under a smooth 
change of variables. 


Proposition 6.71 


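Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$, and let $\Psi : \mathcal{O} \to \mathbb{R}^n$ be a smooth change of variables. If $Q_{\mathbf{c}, r}$ is a cube with center at $\mathbf{c}$ and side length $2r$, then $\Psi(Q_{\mathbf{c}, r})$ is contained in the cube $Q_{\Psi(\mathbf{c}), \lambda r}$, where

\[ \lambda = \max_{1 \le i \le n} \max_{\mathbf{x} \in Q_{\mathbf{c}, r}} \sum_{j=1}^{n} \left| \frac{\partial \Psi_i}{\partial x_j}(\mathbf{x}) \right|. \]

Therefore,

\[ \operatorname{vol}(\Psi(Q_{\mathbf{c}, r})) \le \lambda^n \operatorname{vol}(Q_{\mathbf{c}, r}). \]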

Remark 6.13 


Note that since $Q_{\mathbf{c}, r}$ is a compact set and $\dfrac{\partial \Psi_i}{\partial x_j}$ is continuous for all $1 \le i, j \le n$, $\lambda$ exists.


Proof of Proposition 6.71 
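Notice that $\mathbf{u} \in Q_{\mathbf{c}, r}$ if and only if

\[ |u_i - c_i| \le r \quad\text{for each } 1 \le i \le n. \]

Let $\mathbf{d} = \Psi(\mathbf{c})$. Given $\mathbf{v} = \Psi(\mathbf{u})$ with $\mathbf{u} \in Q_{\mathbf{c}, r}$, we want to show that $\mathbf{v}$ is in $Q_{\mathbf{d}, \lambda r}$, or equivalently,

\[ |v_i - d_i| \le \lambda r \quad\text{for each } 1 \le i \le n. \]

This is basically an application of the mean value theorem. The set $Q_{\mathbf{c}, r}$ is convex and the map $\Psi_i : \mathcal{O} \to \mathbb{R}$ is continuously differentiable. The mean value theorem says that there is a point $\mathbf{x}$ in $Q_{\mathbf{c}, r}$ such that

\[ v_i - d_i = \Psi_i(\mathbf{u}) - \Psi_i(\mathbf{c}) = \sum_{j=1}^{n} \frac{\partial \Psi_i}{\partial x_j}(\mathbf{x})(u_j - c_j). \]

Therefore,

\[ |v_i - d_i| \le \sum_{j=1}^{n} \left| \frac{\partial \Psi_i}{\partial x_j}(\mathbf{x}) \right| |u_j - c_j| \le \lambda r. \]

This proves that $\Psi(Q_{\mathbf{c}, r})$ is contained in $Q_{\Psi(\mathbf{c}), \lambda r}$. The last assertion in the proposition about the volumes is obvious.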
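For a linear map $\Psi(\mathbf{x}) = A\mathbf{x}$, the partial derivatives are the constant entries of $A$, so $\lambda$ is simply the largest absolute row sum of $A$. The sketch below tests the containment $\Psi(Q_{\mathbf{c}, r}) \subset Q_{\Psi(\mathbf{c}), \lambda r}$ on random sample points in this special case; the matrix, the cube and the helper names are hypothetical choices made for illustration, and random sampling only illustrates the statement rather than proving it.

```python
import random


def lam(A):
    """lambda for the linear map x -> Ax: the largest absolute row sum of A."""
    return max(sum(abs(entry) for entry in row) for row in A)


def apply(A, x):
    """Matrix-vector product Ax."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


# Hypothetical invertible matrix, cube centre c and half side length r.
A = [[2.0, -1.0, 0.5], [0.0, 3.0, 1.0], [1.0, 1.0, -1.0]]
c, r = [1.0, -2.0, 0.5], 0.25
d = apply(A, c)

random.seed(0)
inside = True
for _ in range(10000):
    u = [ci + random.uniform(-r, r) for ci in c]
    v = apply(A, u)
    # v should lie in the cube with centre d = Ac and half side length lambda * r.
    if any(abs(vi - di) > lam(A) * r + 1e-12 for vi, di in zip(v, d)):
        inside = False
print(lam(A), inside)
```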


Now we prove Theorem 6.66. 


Proof of Theorem 6.66 


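Since $\overline{\mathfrak{D}}$ is a compact set that is contained in the open set $\mathcal{O}$, Theorem 3.36 says that there is a positive number $d$ and a compact set $C$ such that $\overline{\mathfrak{D}} \subset C \subset \mathcal{O}$, and any point in $\mathbb{R}^n$ that has a distance less than $d$ from a point in $\overline{\mathfrak{D}}$ lies in $C$.

Since $\Psi : \mathcal{O} \to \mathbb{R}^n$ is continuously differentiable, for all $1 \le i, j \le n$, $\dfrac{\partial \Psi_i}{\partial x_j} : C \to \mathbb{R}$ is a continuous function. Since $C$ is a compact set, for each $1 \le i \le n$, the function

\[ \sum_{j=1}^{n} \left| \frac{\partial \Psi_i}{\partial x_j}(\mathbf{x}) \right| \]

has a maximum on $C$. Hence,

\[ \lambda = \max_{1 \le i \le n} \max_{\mathbf{x} \in C} \sum_{j=1}^{n} \left| \frac{\partial \Psi_i}{\partial x_j}(\mathbf{x}) \right| \]

exists. Since $\mathfrak{D}$ is Jordan measurable, $\partial\mathfrak{D}$ has Jordan content zero. Since $C$ contains $\overline{\mathfrak{D}}$, it contains $\partial\mathfrak{D}$. Given $\varepsilon > 0$, there exist cubes $Q_1, Q_2, \ldots, Q_k$, each of which intersects $\partial\mathfrak{D}$, and such that

\[ \partial\mathfrak{D} \subset \bigcup_{j=1}^{k} Q_j \quad\text{and}\quad \sum_{j=1}^{k} \operatorname{vol}(Q_j) < \frac{\varepsilon}{\lambda^n}. \]

Since a uniformly regular partition of a cube will divide the cube into cubes, we can also assume that each of the cubes $Q_j$, $1 \le j \le k$, has diameter less than $d$. This implies that each $Q_j$, $1 \le j \le k$, is contained in $C$. For $1 \le j \le k$, let $l_j$ be the side length of $Q_j$. Proposition 6.71 says that $\Psi(Q_j)$ is contained in a cube $\widetilde{Q}_j$ with side length $\lambda l_j$. Therefore,

\[ \partial\Psi(\mathfrak{D}) = \Psi(\partial\mathfrak{D}) \subset \bigcup_{j=1}^{k} \widetilde{Q}_j, \]

and

\[ \sum_{j=1}^{k} \operatorname{vol}(\widetilde{Q}_j) \le \lambda^n \sum_{j=1}^{k} \operatorname{vol}(Q_j) < \varepsilon. \]

This shows that $\partial\Psi(\mathfrak{D})$ has Jordan content zero. Hence, $\Psi(\mathfrak{D})$ is a Jordan measurable set.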


To prove the change of volume formula, the crucial thing is to first prove the special case where $\mathfrak{D} = \mathbf{I}$ is a rectangle, and $\Psi$ is an invertible linear transformation. In Appendix B, we prove the following theorem which gives the volume of a parallelepiped.


Theorem 6.72 


Let $\mathcal{P}$ be a parallelepiped in $\mathbb{R}^n$ spanned by the linearly independent vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$. Then the volume of $\mathcal{P}$ is equal to $|\det A|$, where $A$ is the matrix whose column vectors are $\mathbf{v}_1, \ldots, \mathbf{v}_n$.


We then use this to deduce the following special case of the change of volume 
formula. 




Theorem 6.73 


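Let $\mathbf{I}$ be a closed rectangle in $\mathbb{R}^n$, and let $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$, $\mathbf{T}(\mathbf{x}) = A\mathbf{x}$ be an invertible linear transformation. Then

\[ \text{vol}(\mathbf{T}(\mathbf{I})) = |\det A|\, \text{vol}(\mathbf{I}). \]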


Using this, we can prove the more general change of volume formula for a 
Jordan measurable set under an invertible linear transformation. 


Theorem 6.74 


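If $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$, $\mathbf{T}(\mathbf{x}) = A\mathbf{x}$ is an invertible linear transformation, and $\mathfrak{D}$ is a Jordan measurable set, then $\mathbf{T}(\mathfrak{D})$ is also Jordan measurable and

\[ \text{vol}(\mathbf{T}(\mathfrak{D})) = |\det A|\, \text{vol}(\mathfrak{D}). \]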


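Since $\mathbf{T} : \mathbb{R}^n \to \mathbb{R}^n$ is invertible, $\det A \neq 0$. The fact that the set $\mathbf{T}(\mathfrak{D})$ is Jordan measurable follows from Theorem 6.66. Let $\mathbf{I}$ be a closed rectangle that contains $\mathfrak{D}$. Since $\mathfrak{D}$ is Jordan measurable, the characteristic function $\chi_{\mathfrak{D}} : \mathbf{I} \to \mathbb{R}$ is Riemann integrable.
Given $\varepsilon > 0$, there is a partition $\mathbf{P}$ of $\mathbf{I}$ such that

\[ U(\chi_{\mathfrak{D}}, \mathbf{P}) - L(\chi_{\mathfrak{D}}, \mathbf{P}) < \frac{\varepsilon}{|\det A|}. \]

Hence,

\[ U(\chi_{\mathfrak{D}}, \mathbf{P}) < \text{vol}(\mathfrak{D}) + \frac{\varepsilon}{|\det A|}. \]

Let

\[ \mathcal{A} = \{\mathbf{J} \in \mathcal{J}_{\mathbf{P}} \mid \mathbf{J} \cap \mathfrak{D} \neq \emptyset\}. \]

Then

\[ \sum_{\mathbf{J} \in \mathcal{A}} \text{vol}(\mathbf{J}) = U(\chi_{\mathfrak{D}}, \mathbf{P}) < \text{vol}(\mathfrak{D}) + \frac{\varepsilon}{|\det A|}. \]

Notice that

\[ \mathfrak{D} \subset \bigcup_{\mathbf{J} \in \mathcal{A}} \mathbf{J}. \]

Therefore,

\[ \mathbf{T}(\mathfrak{D}) \subset \bigcup_{\mathbf{J} \in \mathcal{A}} \mathbf{T}(\mathbf{J}). \]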


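For each rectangle $\mathbf{J}$, $\mathbf{T}(\mathbf{J})$ is a parallelepiped. Any two distinct rectangles in $\mathcal{A}$ are either disjoint or intersect in a set that has Jordan content zero. Therefore, the additivity theorem implies that the set $K$ defined as

\[ K = \bigcup_{\mathbf{J} \in \mathcal{A}} \mathbf{T}(\mathbf{J}) \]

is Jordan measurable, and

\[ \text{vol}(K) = \sum_{\mathbf{J} \in \mathcal{A}} \text{vol}(\mathbf{T}(\mathbf{J})) = |\det A| \sum_{\mathbf{J} \in \mathcal{A}} \text{vol}(\mathbf{J}) < |\det A|\, \text{vol}(\mathfrak{D}) + \varepsilon. \]

Since $\mathbf{T}(\mathfrak{D}) \subset K$, we find that

\[ \text{vol}(\mathbf{T}(\mathfrak{D})) \leq \text{vol}(K) < |\det A|\, \text{vol}(\mathfrak{D}) + \varepsilon. \]

Since $\varepsilon > 0$ is arbitrary, we conclude that

\[ \text{vol}(\mathbf{T}(\mathfrak{D})) \leq |\det A|\, \text{vol}(\mathfrak{D}). \tag{6.16} \]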


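Since $\mathbf{T}^{-1} : \mathbb{R}^n \to \mathbb{R}^n$ is also an invertible linear transformation, we find that

\[ \text{vol}(\mathfrak{D}) = \text{vol}\left(\mathbf{T}^{-1}(\mathbf{T}(\mathfrak{D}))\right) \leq |\det A^{-1}|\, \text{vol}(\mathbf{T}(\mathfrak{D})) = \frac{1}{|\det A|} \text{vol}(\mathbf{T}(\mathfrak{D})). \tag{6.17} \]

Eq. (6.16) and (6.17) together give

\[ \text{vol}(\mathbf{T}(\mathfrak{D})) = |\det A|\, \text{vol}(\mathfrak{D}). \]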
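As a quick sanity check of the linear change of volume formula in dimension 2, the following sketch maps the corners of a rectangle by a matrix and computes the area of the resulting parallelogram with the shoelace formula. The matrix `A`, the rectangle `rect`, and the helper names are illustrative assumptions, not part of the text.

```python
def shoelace_area(points):
    """Area of a simple polygon whose vertices are listed in order."""
    total = 0.0
    n = len(points)
    for k in range(n):
        x0, y0 = points[k]
        x1, y1 = points[(k + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def image_area(A, rect):
    """Area of T(I) for T(x) = Ax and I = [a1, b1] x [a2, b2]."""
    (a1, b1), (a2, b2) = rect
    # Corners of the rectangle listed counterclockwise.
    corners = [(a1, a2), (b1, a2), (b1, b2), (a1, b2)]
    # A linear map sends the rectangle to the parallelogram on the image corners.
    image = [(A[0][0] * x + A[0][1] * y, A[1][0] * x + A[1][1] * y)
             for x, y in corners]
    return shoelace_area(image)


# Hypothetical data chosen only for illustration.
A = [[2.0, 1.0], [-1.0, 3.0]]
rect = ((0.0, 2.0), (1.0, 4.0))
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
vol_I = (rect[0][1] - rect[0][0]) * (rect[1][1] - rect[1][0])
print(image_area(A, rect), abs(det_A) * vol_I)
```

Both printed numbers should agree, which is the statement of Theorem 6.73 for this particular rectangle.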


Recall that by identifying an $n \times n$ matrix $A = [a_{ij}]$ as a point in $\mathbb{R}^{n^2}$, we have defined the norm of $A$ as

\[ \|A\| = \sqrt{\sum_{i=1}^n \sum_{j=1}^n a_{ij}^2}. \]

Besides the triangle inequality, this norm also satisfies the following inequality.
Lemma 6.75 


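If $A = [a_{ij}]$ and $B = [b_{ij}]$ are $n \times n$ matrices, then

\[ \|AB\| \leq \|A\| \|B\|. \]

Let $C = [c_{ij}] = AB$. Then for any $1 \leq i, j \leq n$,

\[ c_{ij} = \sum_{k=1}^n a_{ik} b_{kj}. \]

By Cauchy-Schwarz inequality,

\[ c_{ij}^2 \leq \left(\sum_{k=1}^n a_{ik}^2\right) \left(\sum_{l=1}^n b_{lj}^2\right). \]

Therefore,

\[ \|C\|^2 = \sum_{i=1}^n \sum_{j=1}^n c_{ij}^2 \leq \left(\sum_{i=1}^n \sum_{k=1}^n a_{ik}^2\right) \left(\sum_{j=1}^n \sum_{l=1}^n b_{lj}^2\right) = \|A\|^2 \|B\|^2. \]

This proves that

\[ \|AB\| = \|C\| \leq \|A\| \|B\|. \]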
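The inequality of Lemma 6.75 can be spot-checked numerically. The sketch below is only an illustration on random matrices, not a proof; the function names and the tolerance `1e-12` are assumptions made for this example.

```python
import math
import random


def matrix_norm(A):
    """Norm of A regarded as a point in R^(n^2): square root of the sum of squares."""
    return math.sqrt(sum(a * a for row in A for a in row))


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def check_submultiplicative(n, trials, seed=0):
    """Return True if ||AB|| <= ||A|| ||B|| holds for every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        A = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
        B = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
        # Small tolerance guards against floating-point rounding.
        if matrix_norm(matmul(A, B)) > matrix_norm(A) * matrix_norm(B) + 1e-12:
            return False
    return True


print(check_submultiplicative(4, 200))
```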


Now we prove the change of volume formula, which is the most technical part. 


Proof of Theorem 6.67 
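Given the smooth change of variables $\boldsymbol{\Psi} : \mathcal{O} \to \mathbb{R}^n$, let $g : \mathcal{O} \to \mathbb{R}$ be the continuous function

\[ g(\mathbf{x}) = |\det D\boldsymbol{\Psi}(\mathbf{x})|. \]

We want to show that if $\mathfrak{D}$ is a Jordan measurable set such that its closure $\overline{\mathfrak{D}}$ is contained in $\mathcal{O}$, then

\[ \text{vol}(\boldsymbol{\Psi}(\mathfrak{D})) = \int_{\mathfrak{D}} |\det D\boldsymbol{\Psi}(\mathbf{x})|\, d\mathbf{x} = \int_{\mathfrak{D}} g(\mathbf{x})\, d\mathbf{x} = \mathcal{I}. \]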




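We will first prove that

\[ \text{vol}(\boldsymbol{\Psi}(\mathfrak{D})) \leq \mathcal{I}. \]

By Theorem 6.66, $\boldsymbol{\Psi}(\mathfrak{D})$ is Jordan measurable. By Theorem 6.49, its closure $\overline{\boldsymbol{\Psi}(\mathfrak{D})} = \boldsymbol{\Psi}(\overline{\mathfrak{D}})$ is also Jordan measurable, and

\[ \text{vol}(\boldsymbol{\Psi}(\overline{\mathfrak{D}})) = \text{vol}\left(\overline{\boldsymbol{\Psi}(\mathfrak{D})}\right) = \text{vol}(\boldsymbol{\Psi}(\mathfrak{D})). \]

On the other hand, since $\overline{\mathfrak{D}} \setminus \mathfrak{D}$ has Jordan content zero,

\[ \int_{\overline{\mathfrak{D}}} g(\mathbf{x})\, d\mathbf{x} = \int_{\mathfrak{D}} g(\mathbf{x})\, d\mathbf{x}. \]

Hence, we can assume from the beginning that $\mathfrak{D} = \overline{\mathfrak{D}}$, or equivalently, $\mathfrak{D}$ is closed.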
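As in the proof of Theorem 6.66, Theorem 3.36 says that there is a positive number $d$ and a compact set $C$ such that $\mathfrak{D} \subset C \subset \mathcal{O}$, and any point in $\mathbb{R}^n$ that has a distance less than $d$ from a point in $\mathfrak{D}$ lies in $C$. On the compact set $C$, the function $g : C \to \mathbb{R}$ is continuous. By the extreme value theorem, there are points $\mathbf{u}$ and $\mathbf{v}$ in $C$ such that $g(\mathbf{u}) \leq g(\mathbf{x}) \leq g(\mathbf{v})$ for all $\mathbf{x} \in C$. Let $m_g = g(\mathbf{u})$ and $M_g = g(\mathbf{v})$. Then $m_g > 0$ and

\[ m_g \leq g(\mathbf{x}) \leq M_g \quad\text{for all } \mathbf{x} \in C. \]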


On the other hand, the function DW! : C > R”* is continuous on the 


compact set C’. Hence, it is bounded. Namely, there is a positive number 
M), such that 


||DW~*(x)|| <M, for all x € C. 


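Let $L$ be a positive number such that $C$ is contained in the cube $\mathbf{I} = [-L, L]^n$. Let $\check{g} : \mathbf{I} \to \mathbb{R}$ be the zero extension of $g : \mathfrak{D} \to \mathbb{R}$. For each positive integer $k$, let $\mathbf{P}_k$ be the uniformly regular partition of $\mathbf{I}$ into $k^n$ rectangles. Then $\displaystyle\lim_{k \to \infty} |\mathbf{P}_k| = 0$. Therefore,

\[ \lim_{k \to \infty} U(\check{g}, \mathbf{P}_k) = \int_{\mathbf{I}} \check{g}(\mathbf{x})\, d\mathbf{x} = \int_{\mathfrak{D}} g(\mathbf{x})\, d\mathbf{x} \]

and

\[ \lim_{k \to \infty} \left(U(\chi_{\mathfrak{D}}, \mathbf{P}_k) - L(\chi_{\mathfrak{D}}, \mathbf{P}_k)\right) = 0. \]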


The compactness of C’' implies that the continuous functions DW : C' > 


R”” and g : C > Rare uniformly continuous. Given ¢ > 0, there exists a 
6; > O such that if u and v are points in C' with ||u — v|| < 61, then 


|D¥(u) —DY(v)| <z-— and |g(u) — 9()] < mee. 


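Let $\delta = \min\{d, \delta_1\}$. There is a positive integer $K$ such that for all $k \geq K$, $|\mathbf{P}_k| < \delta$,

\[ U(\check{g}, \mathbf{P}_k) < \mathcal{I} + \varepsilon \quad\text{and}\quad U(\chi_{\mathfrak{D}}, \mathbf{P}_k) - L(\chi_{\mathfrak{D}}, \mathbf{P}_k) < \frac{\varepsilon}{M_g}. \]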


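Consider the partition $\mathbf{P} = \mathbf{P}_K$. Since $\mathbf{I}$ is a cube and $\mathbf{P}$ is a uniformly regular partition of $\mathbf{I}$, each rectangle in the partition $\mathbf{P}$ is also a cube, all with the same side length $2r$. Denote by $\mathcal{A}$ and $\mathcal{B}$ the sets

\[ \mathcal{A} = \{\mathbf{J} \in \mathcal{J}_{\mathbf{P}} \mid \mathbf{J} \cap \mathfrak{D} \neq \emptyset\}, \qquad \mathcal{B} = \{\mathbf{J} \in \mathcal{J}_{\mathbf{P}} \mid \mathbf{J} \subset \mathfrak{D}\}. \]

$\mathcal{A}$ is a finite collection of cubes, and it contains the collection $\mathcal{B}$. By definition,

\[ L(\chi_{\mathfrak{D}}, \mathbf{P}) = \sum_{\mathbf{J} \in \mathcal{B}} \text{vol}(\mathbf{J}), \qquad U(\chi_{\mathfrak{D}}, \mathbf{P}) = \sum_{\mathbf{J} \in \mathcal{A}} \text{vol}(\mathbf{J}). \]

Therefore,

\[ \sum_{\mathbf{J} \in \mathcal{A} \setminus \mathcal{B}} \text{vol}(\mathbf{J}) < \frac{\varepsilon}{M_g}. \]

After renaming, we can assume that

\[ \mathcal{A} = \{Q_\beta \mid 1 \leq \beta \leq s\}, \]

where $Q_\beta = Q_{\mathbf{c}_\beta, r}$ is a cube with center at $\mathbf{c}_\beta$ and side length $2r$. By the definition of $\mathcal{A}$,

\[ \mathfrak{D} \subset \bigcup_{\beta=1}^s Q_\beta. \]

Therefore,

\[ \boldsymbol{\Psi}(\mathfrak{D}) \subset \bigcup_{\beta=1}^s \boldsymbol{\Psi}(Q_\beta). \]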


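Fix a $1 \leq \beta \leq s$. Since $Q_\beta$ intersects $\mathfrak{D}$ and

\[ \text{diam}\, Q_\beta \leq |\mathbf{P}| < \delta \leq d, \]

we find that $Q_\beta$ is contained in $C$. Define the invertible linear transformation $\mathbf{T}_\beta : \mathbb{R}^n \to \mathbb{R}^n$ by $\mathbf{T}_\beta(\mathbf{x}) = A_\beta \mathbf{x}$, where $A_\beta = D\boldsymbol{\Psi}(\mathbf{c}_\beta)$. Then let $\boldsymbol{\Phi}_\beta : Q_\beta \to \mathbb{R}^n$ be the map $\boldsymbol{\Phi}_\beta = \mathbf{T}_\beta^{-1} \circ \boldsymbol{\Psi} : Q_\beta \to \mathbb{R}^n$. By chain rule,

\[ D\boldsymbol{\Phi}_\beta(\mathbf{x}) = D\mathbf{T}_\beta^{-1}(\boldsymbol{\Psi}(\mathbf{x}))\, D\boldsymbol{\Psi}(\mathbf{x}) = A_\beta^{-1} D\boldsymbol{\Psi}(\mathbf{x}) \quad\text{for all } \mathbf{x} \in Q_\beta. \]

Therefore, for $\mathbf{x} \in Q_\beta$,

\[ D\boldsymbol{\Phi}_\beta(\mathbf{x}) - I_n = D\boldsymbol{\Psi}^{-1}(\boldsymbol{\Psi}(\mathbf{c}_\beta)) \left(D\boldsymbol{\Psi}(\mathbf{x}) - D\boldsymbol{\Psi}(\mathbf{c}_\beta)\right). \]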
Since ||x — cg|| < diam Qg, < 6 < 6, Lemma 6.75 implies that 
|D&,(x) — I < DY "(H(es))|] []D¥Ex) — D¥(es) || < = 
This implies that if 2 4 7, 


for all x € Qs; 


for all x € Qs. 


“| A(®s); 
Ag = max max APs): <1+e. 
1Stgn xeQe OX; 


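Since $\boldsymbol{\Psi} = \mathbf{T}_\beta \circ \boldsymbol{\Phi}_\beta$, by Theorem 6.74 and Proposition 6.71, we have

\[ \text{vol}(\boldsymbol{\Psi}(Q_\beta)) = |\det A_\beta|\, \text{vol}(\boldsymbol{\Phi}_\beta(Q_\beta)) \leq |\det A_\beta|\, \lambda_\beta^n\, \text{vol}(Q_\beta). \]

Summing over $\beta$, we find that

\[ \text{vol}(\boldsymbol{\Psi}(\mathfrak{D})) \leq \sum_{\beta=1}^s \text{vol}(\boldsymbol{\Psi}(Q_\beta)) \leq (1 + \varepsilon)^n \sum_{\beta=1}^s |\det D\boldsymbol{\Psi}(\mathbf{c}_\beta)|\, \text{vol}(Q_\beta). \]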
B=1 


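We divide the sum into a sum over those $Q_\beta$ in $\mathcal{B}$ and a sum over those $Q_\beta$ in $\mathcal{A} \setminus \mathcal{B}$. For the sum over those in $\mathcal{B}$, we find that

\[ \sum_{Q_\beta \in \mathcal{B}} |\det D\boldsymbol{\Psi}(\mathbf{c}_\beta)|\, \text{vol}(Q_\beta) \leq U(\check{g}, \mathbf{P}) < \mathcal{I} + \varepsilon. \]

For the sum over those $Q_\beta$ in $\mathcal{A} \setminus \mathcal{B}$,

\[ \sum_{Q_\beta \in \mathcal{A} \setminus \mathcal{B}} |\det D\boldsymbol{\Psi}(\mathbf{c}_\beta)|\, \text{vol}(Q_\beta) \leq M_g \sum_{Q_\beta \in \mathcal{A} \setminus \mathcal{B}} \text{vol}(Q_\beta) < \varepsilon. \]

Hence,

\[ \text{vol}(\boldsymbol{\Psi}(\mathfrak{D})) \leq (1 + \varepsilon)^n (\mathcal{I} + 2\varepsilon). \]

Since $\varepsilon > 0$ is arbitrary, taking the limit $\varepsilon \to 0^+$, we find that

\[ \text{vol}(\boldsymbol{\Psi}(\mathfrak{D})) \leq \mathcal{I} = \int_{\mathfrak{D}} |\det D\boldsymbol{\Psi}(\mathbf{x})|\, d\mathbf{x}. \]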


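This is true for any smooth change of variables $\boldsymbol{\Psi} : \mathcal{O} \to \mathbb{R}^n$ and any Jordan measurable closed subset $\mathfrak{D}$ that is contained in $\mathcal{O}$.
Now we want to prove the opposite inequality. First note that the same inequality can be applied to the smooth change of variables $\boldsymbol{\Psi}^{-1} : \boldsymbol{\Psi}(\mathcal{O}) \to \mathbb{R}^n$. Thus, if $\mathfrak{F}$ is a closed Jordan measurable subset of $\boldsymbol{\Psi}(\mathcal{O})$, then

\[ \text{vol}(\boldsymbol{\Psi}^{-1}(\mathfrak{F})) \leq \int_{\mathfrak{F}} \left|\det D\boldsymbol{\Psi}^{-1}(\mathbf{u})\right| d\mathbf{u}. \tag{6.18} \]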
F 


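Using the same $\varepsilon$ and partition $\mathbf{P}$ as above, for $1 \leq \beta \leq s$, let

\[ \mathfrak{F}_\beta = \boldsymbol{\Psi}(\mathfrak{D} \cap Q_\beta). \]

Since $\mathfrak{D}$ and $Q_\beta$ are closed Jordan measurable sets, $\mathfrak{D} \cap Q_\beta$ is also a closed Jordan measurable set, and so is $\mathfrak{F}_\beta$. The additivity theorem implies that

\[ \mathcal{I} = \int_{\mathfrak{D}} g(\mathbf{x})\, d\mathbf{x} = \sum_{\beta=1}^s \int_{\mathfrak{D} \cap Q_\beta} g(\mathbf{x})\, d\mathbf{x}. \]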




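For each $1 \leq \beta \leq s$, since $\mathfrak{D} \cap Q_\beta$ is compact, there is a point $\mathbf{v}_\beta \in \mathfrak{D} \cap Q_\beta$ such that

\[ g(\mathbf{x}) \leq g(\mathbf{v}_\beta) \quad\text{for all } \mathbf{x} \in \mathfrak{D} \cap Q_\beta. \]

This gives

\[ \int_{\mathfrak{D} \cap Q_\beta} g(\mathbf{x})\, d\mathbf{x} \leq g(\mathbf{v}_\beta)\, \text{vol}(\mathfrak{D} \cap Q_\beta) = g(\mathbf{v}_\beta)\, \text{vol}\left(\boldsymbol{\Psi}^{-1}(\mathfrak{F}_\beta)\right). \]

By (6.18), we find that

\[ \int_{\mathfrak{D} \cap Q_\beta} g(\mathbf{x})\, d\mathbf{x} \leq g(\mathbf{v}_\beta) \int_{\mathfrak{F}_\beta} \left|\det D\boldsymbol{\Psi}^{-1}(\mathbf{u})\right| d\mathbf{u}. \]

Again, there is a point $\mathbf{w}_\beta \in \mathfrak{D} \cap Q_\beta$ such that

\[ \left|\det D\boldsymbol{\Psi}^{-1}(\mathbf{u})\right| \leq \left|\det D\boldsymbol{\Psi}^{-1}(\boldsymbol{\Psi}(\mathbf{w}_\beta))\right| \quad\text{for all } \mathbf{u} \in \mathfrak{F}_\beta. \]

This implies that

\[ \int_{\mathfrak{F}_\beta} \left|\det D\boldsymbol{\Psi}^{-1}(\mathbf{u})\right| d\mathbf{u} \leq \left|\det D\boldsymbol{\Psi}(\mathbf{w}_\beta)\right|^{-1} \text{vol}(\mathfrak{F}_\beta). \]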


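Hence,

\[ \int_{\mathfrak{D} \cap Q_\beta} g(\mathbf{x})\, d\mathbf{x} \leq \frac{g(\mathbf{v}_\beta)}{g(\mathbf{w}_\beta)} \text{vol}(\mathfrak{F}_\beta) = \text{vol}(\mathfrak{F}_\beta) \left(1 + \frac{g(\mathbf{v}_\beta) - g(\mathbf{w}_\beta)}{g(\mathbf{w}_\beta)}\right). \]

Now since $\mathbf{v}_\beta$ and $\mathbf{w}_\beta$ are in $Q_\beta$, $\|\mathbf{v}_\beta - \mathbf{w}_\beta\| < \delta_1$. Thus,

\[ \frac{g(\mathbf{v}_\beta) - g(\mathbf{w}_\beta)}{g(\mathbf{w}_\beta)} \leq \frac{1}{m_g} |g(\mathbf{v}_\beta) - g(\mathbf{w}_\beta)| < \varepsilon. \]

This gives

\[ \int_{\mathfrak{D} \cap Q_\beta} g(\mathbf{x})\, d\mathbf{x} \leq (1 + \varepsilon)\, \text{vol}(\boldsymbol{\Psi}(\mathfrak{D} \cap Q_\beta)). \]

Summing over $\beta$ and using the additivity theorem, we find that

\[ \mathcal{I} = \int_{\mathfrak{D}} g(\mathbf{x})\, d\mathbf{x} \leq (1 + \varepsilon) \sum_{\beta=1}^s \text{vol}(\boldsymbol{\Psi}(\mathfrak{D} \cap Q_\beta)) = (1 + \varepsilon)\, \text{vol}(\boldsymbol{\Psi}(\mathfrak{D})). \]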
9 eh 




Taking $\varepsilon \to 0^+$ gives the desired inequality

\[ \mathcal{I} \leq \text{vol}(\boldsymbol{\Psi}(\mathfrak{D})). \]


This completes the proof of the change of volume theorem. 


To conclude the proof of the change of variables theorem, we need the following 


generalization of the mean value theorem for integrals. 


Theorem 6.76 Generalized Mean Value Theorem for Integrals 


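Let $\mathfrak{D}$ be a compact Jordan measurable set, and let $f : \mathfrak{D} \to \mathbb{R}$ and $g : \mathfrak{D} \to \mathbb{R}$ be continuous functions. If $\mathfrak{D}$ is connected or path-connected, and $g(\mathbf{x}) \geq 0$ for all $\mathbf{x} \in \mathfrak{D}$, then there is a point $\mathbf{x}_0$ in $\mathfrak{D}$ such that

\[ \int_{\mathfrak{D}} f(\mathbf{x}) g(\mathbf{x})\, d\mathbf{x} = f(\mathbf{x}_0) \int_{\mathfrak{D}} g(\mathbf{x})\, d\mathbf{x}. \]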


The proof of this generalized mean value theorem is almost the same as the 
mean value theorem. The latter can be considered as the special case where 
g(x) = 1 for allx in D. 


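Since $\mathfrak{D}$ is compact and $f : \mathfrak{D} \to \mathbb{R}$ is continuous, the extreme value theorem asserts that there exist points $\mathbf{u}$ and $\mathbf{v}$ in $\mathfrak{D}$ such that

\[ f(\mathbf{u}) \leq f(\mathbf{x}) \leq f(\mathbf{v}) \quad\text{for all } \mathbf{x} \in \mathfrak{D}. \]

Since $g(\mathbf{x}) \geq 0$ for all $\mathbf{x} \in \mathfrak{D}$, we find that

\[ f(\mathbf{u}) g(\mathbf{x}) \leq f(\mathbf{x}) g(\mathbf{x}) \leq f(\mathbf{v}) g(\mathbf{x}) \quad\text{for all } \mathbf{x} \in \mathfrak{D}. \]

The monotonicity theorem implies that

\[ f(\mathbf{u}) \int_{\mathfrak{D}} g(\mathbf{x})\, d\mathbf{x} \leq \int_{\mathfrak{D}} f(\mathbf{x}) g(\mathbf{x})\, d\mathbf{x} \leq f(\mathbf{v}) \int_{\mathfrak{D}} g(\mathbf{x})\, d\mathbf{x}. \]

Let

\[ U = \int_{\mathfrak{D}} g(\mathbf{x})\, d\mathbf{x}. \]

If $U = 0$, we can take $\mathbf{x}_0$ to be any point in $\mathfrak{D}$. If $U \neq 0$, notice that

\[ c = \frac{1}{U} \int_{\mathfrak{D}} f(\mathbf{x}) g(\mathbf{x})\, d\mathbf{x} \]

satisfies

\[ f(\mathbf{u}) \leq c \leq f(\mathbf{v}). \]

As in the proof of the mean value theorem, the fact that $\mathfrak{D}$ is connected or path-connected allows us to conclude that there is an $\mathbf{x}_0$ in $\mathfrak{D}$ such that

\[ \frac{1}{U} \int_{\mathfrak{D}} f(\mathbf{x}) g(\mathbf{x})\, d\mathbf{x} = c = f(\mathbf{x}_0). \]

This gives

\[ \int_{\mathfrak{D}} f(\mathbf{x}) g(\mathbf{x})\, d\mathbf{x} = f(\mathbf{x}_0) \int_{\mathfrak{D}} g(\mathbf{x})\, d\mathbf{x}. \]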


Next, we prove the special case of the change of variables theorem when $\mathfrak{D}$ is a closed rectangle.


Theorem 6.77 


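Let $\mathcal{O}$ be an open subset of $\mathbb{R}^n$, and let $\boldsymbol{\Psi} : \mathcal{O} \to \mathbb{R}^n$ be a smooth change of variables. If $\mathbf{I}$ is a closed rectangle contained in $\mathcal{O}$, and $f : \boldsymbol{\Psi}(\mathbf{I}) \to \mathbb{R}$ is continuous, then

\[ \int_{\boldsymbol{\Psi}(\mathbf{I})} f(\mathbf{x})\, d\mathbf{x} = \int_{\mathbf{I}} f(\boldsymbol{\Psi}(\mathbf{x})) \left|\det D\boldsymbol{\Psi}(\mathbf{x})\right| d\mathbf{x}. \]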


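It is sufficient to show that for any $\varepsilon > 0$,

\[ \left| \int_{\boldsymbol{\Psi}(\mathbf{I})} f(\mathbf{x})\, d\mathbf{x} - \int_{\mathbf{I}} f(\boldsymbol{\Psi}(\mathbf{x})) \left|\det D\boldsymbol{\Psi}(\mathbf{x})\right| d\mathbf{x} \right| < \varepsilon. \]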




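Since $f : \boldsymbol{\Psi}(\mathbf{I}) \to \mathbb{R}$ and $\boldsymbol{\Psi} : \mathbf{I} \to \mathbb{R}^n$ are continuous, $(f \circ \boldsymbol{\Psi}) : \mathbf{I} \to \mathbb{R}$ is continuous. Since $\mathbf{I}$ is compact, $(f \circ \boldsymbol{\Psi}) : \mathbf{I} \to \mathbb{R}$ is uniformly continuous. Given $\varepsilon > 0$, there is a $\delta > 0$ such that if $\mathbf{u}$ and $\mathbf{v}$ are points in $\mathbf{I}$ with $\|\mathbf{u} - \mathbf{v}\| < \delta$, then

\[ |f(\boldsymbol{\Psi}(\mathbf{u})) - f(\boldsymbol{\Psi}(\mathbf{v}))| < \frac{\varepsilon}{\text{vol}(\boldsymbol{\Psi}(\mathbf{I}))}. \]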


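Let $\mathbf{P}$ be a partition of $\mathbf{I}$ such that $|\mathbf{P}| < \delta$. By the additivity theorem,

\[ \int_{\boldsymbol{\Psi}(\mathbf{I})} f(\mathbf{x})\, d\mathbf{x} = \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} \int_{\boldsymbol{\Psi}(\mathbf{J})} f(\mathbf{x})\, d\mathbf{x} \]

and

\[ \int_{\mathbf{I}} f(\boldsymbol{\Psi}(\mathbf{x})) \left|\det D\boldsymbol{\Psi}(\mathbf{x})\right| d\mathbf{x} = \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} \int_{\mathbf{J}} f(\boldsymbol{\Psi}(\mathbf{x})) \left|\det D\boldsymbol{\Psi}(\mathbf{x})\right| d\mathbf{x}. \]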


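Each $\mathbf{J} \in \mathcal{J}_{\mathbf{P}}$ is a closed rectangle. Hence, it is compact and path-connected. Since $\boldsymbol{\Psi} : \mathbf{I} \to \mathbb{R}^n$ is continuous, $\boldsymbol{\Psi}(\mathbf{J})$ is also compact and path-connected. By the generalized mean value theorem, for each $\mathbf{J} \in \mathcal{J}_{\mathbf{P}}$, there exist $\mathbf{u}_{\mathbf{J}}$ and $\mathbf{v}_{\mathbf{J}}$ in $\mathbf{J}$ such that

\[ \int_{\boldsymbol{\Psi}(\mathbf{J})} f(\mathbf{x})\, d\mathbf{x} = f(\boldsymbol{\Psi}(\mathbf{u}_{\mathbf{J}}))\, \text{vol}(\boldsymbol{\Psi}(\mathbf{J})), \]

\[ \int_{\mathbf{J}} f(\boldsymbol{\Psi}(\mathbf{x})) \left|\det D\boldsymbol{\Psi}(\mathbf{x})\right| d\mathbf{x} = f(\boldsymbol{\Psi}(\mathbf{v}_{\mathbf{J}})) \int_{\mathbf{J}} \left|\det D\boldsymbol{\Psi}(\mathbf{x})\right| d\mathbf{x}. \]

By the change of volume theorem,

\[ \int_{\mathbf{J}} \left|\det D\boldsymbol{\Psi}(\mathbf{x})\right| d\mathbf{x} = \text{vol}(\boldsymbol{\Psi}(\mathbf{J})). \]

Since $\|\mathbf{u}_{\mathbf{J}} - \mathbf{v}_{\mathbf{J}}\| \leq \text{diam}\, \mathbf{J} < \delta$, we find that

\[ |f(\boldsymbol{\Psi}(\mathbf{u}_{\mathbf{J}})) - f(\boldsymbol{\Psi}(\mathbf{v}_{\mathbf{J}}))| < \frac{\varepsilon}{\text{vol}(\boldsymbol{\Psi}(\mathbf{I}))}. \]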




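It follows that

\[
\begin{aligned}
&\left| \int_{\boldsymbol{\Psi}(\mathbf{I})} f(\mathbf{x})\, d\mathbf{x} - \int_{\mathbf{I}} f(\boldsymbol{\Psi}(\mathbf{x})) \left|\det D\boldsymbol{\Psi}(\mathbf{x})\right| d\mathbf{x} \right| \\
&\quad\leq \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} \left| \int_{\boldsymbol{\Psi}(\mathbf{J})} f(\mathbf{x})\, d\mathbf{x} - \int_{\mathbf{J}} f(\boldsymbol{\Psi}(\mathbf{x})) \left|\det D\boldsymbol{\Psi}(\mathbf{x})\right| d\mathbf{x} \right| \\
&\quad= \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} |f(\boldsymbol{\Psi}(\mathbf{u}_{\mathbf{J}})) - f(\boldsymbol{\Psi}(\mathbf{v}_{\mathbf{J}}))|\, \text{vol}(\boldsymbol{\Psi}(\mathbf{J})) \\
&\quad< \frac{\varepsilon}{\text{vol}(\boldsymbol{\Psi}(\mathbf{I}))} \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} \text{vol}(\boldsymbol{\Psi}(\mathbf{J})) = \frac{\varepsilon}{\text{vol}(\boldsymbol{\Psi}(\mathbf{I}))} \times \text{vol}(\boldsymbol{\Psi}(\mathbf{I})) = \varepsilon.
\end{aligned}
\]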


This completes the proof. 


Finally, we conclude the general case. 


Conclusion of the Proof of the Change of Variables Theorem 


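As in the special case, it is sufficient to show that for any $\varepsilon > 0$,

\[ \left| \int_{\boldsymbol{\Psi}(\mathfrak{D})} f(\mathbf{x})\, d\mathbf{x} - \int_{\mathfrak{D}} f(\boldsymbol{\Psi}(\mathbf{x})) \left|\det D\boldsymbol{\Psi}(\mathbf{x})\right| d\mathbf{x} \right| < \varepsilon. \]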


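Let us first proceed as in the proof of the change of volume theorem. There is a positive number $d$ and a compact set $C$ such that $\mathfrak{D} \subset C \subset \mathcal{O}$, and any point in $\mathbb{R}^n$ that has a distance less than $d$ from a point in $\mathfrak{D}$ lies in $C$. On the compact set $C$, the function $g : C \to \mathbb{R}$, $g(\mathbf{x}) = |\det D\boldsymbol{\Psi}(\mathbf{x})|$ is continuous. Therefore, it is bounded. Namely, there is a positive number $M_g$ so that

\[ |\det D\boldsymbol{\Psi}(\mathbf{x})| \leq M_g \quad\text{for all } \mathbf{x} \in C. \]

Since we assume that the function $f : \boldsymbol{\Psi}(\mathfrak{D}) \to \mathbb{R}$ is bounded, there is a positive number $M_f$ such that

\[ |f(\boldsymbol{\Psi}(\mathbf{x}))| \leq M_f \quad\text{for all } \mathbf{x} \in \mathfrak{D}. \]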




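Let $M = M_f M_g$, and let $\mathbf{I}$ be a rectangle that contains $\mathfrak{D}$. Given $\varepsilon > 0$, let $\mathbf{P}$ be a partition of $\mathbf{I}$ such that $|\mathbf{P}| < d$ and

\[ L(\chi_{\mathfrak{D}}, \mathbf{P}) > \text{vol}(\mathfrak{D}) - \frac{\varepsilon}{2M}. \]

Let

\[ \mathcal{A} = \{\mathbf{J} \in \mathcal{J}_{\mathbf{P}} \mid \mathbf{J} \cap \mathfrak{D} \neq \emptyset\}, \qquad \mathcal{B} = \{\mathbf{J} \in \mathcal{J}_{\mathbf{P}} \mid \mathbf{J} \subset \mathfrak{D}\}. \]

Since $|\mathbf{P}| < d$, each $\mathbf{J}$ in $\mathcal{A}$ is contained in $C$. Moreover,

\[ L(\chi_{\mathfrak{D}}, \mathbf{P}) = \sum_{\mathbf{J} \in \mathcal{B}} \text{vol}(\mathbf{J}). \]

Denote by $\mathfrak{Q}$ the set

\[ \mathfrak{Q} = \bigcup_{\mathbf{J} \in \mathcal{B}} \mathbf{J}. \]

Then $\mathfrak{Q}$ is a compact subset of $\mathfrak{D}$, $\mathfrak{D} \setminus \mathfrak{Q}$ is Jordan measurable, and

\[ \text{vol}(\mathfrak{D} \setminus \mathfrak{Q}) = \text{vol}(\mathfrak{D}) - \text{vol}(\mathfrak{Q}) = \text{vol}(\mathfrak{D}) - L(\chi_{\mathfrak{D}}, \mathbf{P}) < \frac{\varepsilon}{2M}. \]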


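By the additivity theorem,

\[ \int_{\boldsymbol{\Psi}(\mathfrak{D})} f(\mathbf{x})\, d\mathbf{x} = \sum_{\mathbf{J} \in \mathcal{B}} \int_{\boldsymbol{\Psi}(\mathbf{J})} f(\mathbf{x})\, d\mathbf{x} + \int_{\boldsymbol{\Psi}(\mathfrak{D} \setminus \mathfrak{Q})} f(\mathbf{x})\, d\mathbf{x}, \]

\[ \int_{\mathfrak{D}} f(\boldsymbol{\Psi}(\mathbf{x})) \left|\det D\boldsymbol{\Psi}(\mathbf{x})\right| d\mathbf{x} = \sum_{\mathbf{J} \in \mathcal{B}} \int_{\mathbf{J}} f(\boldsymbol{\Psi}(\mathbf{x})) \left|\det D\boldsymbol{\Psi}(\mathbf{x})\right| d\mathbf{x} + \int_{\mathfrak{D} \setminus \mathfrak{Q}} f(\boldsymbol{\Psi}(\mathbf{x})) \left|\det D\boldsymbol{\Psi}(\mathbf{x})\right| d\mathbf{x}. \]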


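Theorem 6.77 says that for each $\mathbf{J}$ in $\mathcal{B}$,

\[ \int_{\boldsymbol{\Psi}(\mathbf{J})} f(\mathbf{x})\, d\mathbf{x} = \int_{\mathbf{J}} f(\boldsymbol{\Psi}(\mathbf{x})) \left|\det D\boldsymbol{\Psi}(\mathbf{x})\right| d\mathbf{x}. \]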




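Therefore,

\[ \left| \int_{\boldsymbol{\Psi}(\mathfrak{D})} f(\mathbf{x})\, d\mathbf{x} - \int_{\mathfrak{D}} f(\boldsymbol{\Psi}(\mathbf{x})) \left|\det D\boldsymbol{\Psi}(\mathbf{x})\right| d\mathbf{x} \right| \leq \left| \int_{\boldsymbol{\Psi}(\mathfrak{D} \setminus \mathfrak{Q})} f(\mathbf{x})\, d\mathbf{x} \right| + \left| \int_{\mathfrak{D} \setminus \mathfrak{Q}} f(\boldsymbol{\Psi}(\mathbf{x})) \left|\det D\boldsymbol{\Psi}(\mathbf{x})\right| d\mathbf{x} \right|. \]

For the term $\displaystyle\int_{\boldsymbol{\Psi}(\mathfrak{D} \setminus \mathfrak{Q})} f(\mathbf{x})\, d\mathbf{x}$, we have

\[ \left| \int_{\boldsymbol{\Psi}(\mathfrak{D} \setminus \mathfrak{Q})} f(\mathbf{x})\, d\mathbf{x} \right| \leq M_f\, \text{vol}(\boldsymbol{\Psi}(\mathfrak{D} \setminus \mathfrak{Q})). \]

By the change of volume theorem,

\[ \text{vol}(\boldsymbol{\Psi}(\mathfrak{D} \setminus \mathfrak{Q})) = \int_{\mathfrak{D} \setminus \mathfrak{Q}} \left|\det D\boldsymbol{\Psi}(\mathbf{x})\right| d\mathbf{x} \leq M_g\, \text{vol}(\mathfrak{D} \setminus \mathfrak{Q}). \]

Therefore,

\[ \left| \int_{\boldsymbol{\Psi}(\mathfrak{D} \setminus \mathfrak{Q})} f(\mathbf{x})\, d\mathbf{x} \right| \leq M_f M_g\, \text{vol}(\mathfrak{D} \setminus \mathfrak{Q}) = M\, \text{vol}(\mathfrak{D} \setminus \mathfrak{Q}) < \frac{\varepsilon}{2}. \]

Similarly,

\[ \left| \int_{\mathfrak{D} \setminus \mathfrak{Q}} f(\boldsymbol{\Psi}(\mathbf{x})) \left|\det D\boldsymbol{\Psi}(\mathbf{x})\right| d\mathbf{x} \right| \leq M\, \text{vol}(\mathfrak{D} \setminus \mathfrak{Q}) < \frac{\varepsilon}{2}. \]

This gives

\[ \left| \int_{\boldsymbol{\Psi}(\mathfrak{D})} f(\mathbf{x})\, d\mathbf{x} - \int_{\mathfrak{D}} f(\boldsymbol{\Psi}(\mathbf{x})) \left|\det D\boldsymbol{\Psi}(\mathbf{x})\right| d\mathbf{x} \right| < \varepsilon, \]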


which completes the proof. 




6.7 Some Important Integrals and Their Applications 


Up to now we have only discussed multiple integrals for bounded functions $f : \mathfrak{D} \to \mathbb{R}$ defined on bounded domains. For practical applications, we need to consider improper integrals where the function is not bounded or the domain is not bounded. As in the single variable case, we need to take limits. In the multi-variable case, things become considerably more complicated. Interested readers can read the corresponding sections in the book [Zor16]. In this section, we use theories learned in multiple integrals to derive some explicit formulas of improper integrals of single-variable functions, without introducing the definition of improper multiple integrals. We then give some applications of these formulas.


Proposition 6.78 


For any positive number $a$,

\[ \int_{-\infty}^{\infty} e^{-ax^2}\, dx = \sqrt{\frac{\pi}{a}}. \]


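Since the function $f : \mathbb{R} \to \mathbb{R}$, $f(x) = e^{-ax^2}$ is positive for all $x \in \mathbb{R}$,

\[ \int_{-\infty}^{\infty} e^{-ax^2}\, dx = \lim_{L \to \infty} \int_{-L}^{L} e^{-ax^2}\, dx. \]

Given a positive number $R$, we consider the double integral

\[ I_R = \int_{B(\mathbf{0}, R)} e^{-a(x^2 + y^2)}\, dx\, dy. \]

For any positive number $L$,

\[ B(\mathbf{0}, L) \subset [-L, L] \times [-L, L] \subset B(\mathbf{0}, \sqrt{2}L). \]

Since the function $g : \mathbb{R}^2 \to \mathbb{R}$, $g(x, y) = e^{-a(x^2 + y^2)}$ is positive,

\[ I_L \leq \int_{[-L,L] \times [-L,L]} e^{-a(x^2 + y^2)}\, dx\, dy \leq I_{\sqrt{2}L}. \tag{6.19} \]

Using polar coordinates, we find that

\[ I_R = \int_0^{2\pi} \int_0^R e^{-ar^2} r\, dr\, d\theta = 2\pi \left[ -\frac{e^{-ar^2}}{2a} \right]_0^R = \frac{\pi}{a} \left(1 - e^{-aR^2}\right). \]

Eq. (6.19) then implies that

\[ \lim_{L \to \infty} \int_{[-L,L] \times [-L,L]} e^{-a(x^2 + y^2)}\, dx\, dy = \frac{\pi}{a}. \]

By Fubini's theorem,

\[ \int_{[-L,L] \times [-L,L]} e^{-a(x^2 + y^2)}\, dx\, dy = \left( \int_{-L}^{L} e^{-ax^2}\, dx \right)^2. \]

Thus, we conclude that

\[ \int_{-\infty}^{\infty} e^{-ax^2}\, dx = \lim_{L \to \infty} \int_{-L}^{L} e^{-ax^2}\, dx = \sqrt{\frac{\pi}{a}}. \]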


Figure 6.48: $B(\mathbf{0}, L) \subset [-L, L] \times [-L, L] \subset B(\mathbf{0}, \sqrt{2}L)$.
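The formula of Proposition 6.78 is easy to test numerically. The following sketch approximates the truncated integral over $[-L, L]$ by the midpoint rule and compares it with $\sqrt{\pi/a}$; the function name, the cutoff `L = 10` and the number of steps are illustrative choices, not part of the text.

```python
import math


def gaussian_integral(a, L, steps=20000):
    """Midpoint-rule approximation of the integral of exp(-a x^2) over [-L, L]."""
    h = 2.0 * L / steps
    return h * sum(math.exp(-a * (-L + (k + 0.5) * h) ** 2)
                   for k in range(steps))


for a in (0.5, 1.0, 3.0):
    # The tail beyond |x| = 10 is far below the displayed precision for these a.
    print(a, gaussian_integral(a, 10.0), math.sqrt(math.pi / a))
```

The two columns should agree to many digits, since the integrand decays so quickly that truncating at $L = 10$ loses almost nothing.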


The improper integral $\displaystyle\int_{-\infty}^{\infty} e^{-ax^2}\, dx$ with $a > 0$ plays an important role in various areas of mathematics. For example, in probability theory, the probability density function of a normal random variable with mean $\mu$ and standard deviation $\sigma$ is given by

\[ f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right). \]

The normalization factor $1/(\sqrt{2\pi}\,\sigma)$ is required such that

\[ \int_{-\infty}^{\infty} f(x)\, dx = 1, \]

which ensures that total probability is 1.
Recall that we have defined the gamma function $\Gamma(s)$ for a real number $s > 0$ by the improper integral

\[ \Gamma(s) = \int_0^{\infty} t^{s-1} e^{-t}\, dt. \]

The value of $\Gamma(1)$ is easy to compute.

\[ \Gamma(1) = \int_0^{\infty} e^{-t}\, dt = 1. \]

Using integration by parts, one can show that

\[ \Gamma(s + 1) = s\Gamma(s) \quad\text{when } s > 0. \tag{6.20} \]

From this, we find that

\[ \Gamma(n + 1) = n! \quad\text{for all } n \in \mathbb{Z}^+. \]

The value of $\Gamma(s)$ when $s = 1/2$ is also of particular interest.


Theorem 6.79 


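The value of the gamma function $\Gamma(s)$ at $s = 1/2$ is

\[ \Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi}. \]

By Proposition 6.78,

\[ \sqrt{\pi} = 2 \int_0^{\infty} e^{-x^2}\, dx = 2 \lim_{a \to 0^+} \lim_{L \to \infty} \int_a^L e^{-x^2}\, dx. \]

Making a change of variables $t = x^2$, we find that

\[ 2 \int_a^L e^{-x^2}\, dx = \int_{a^2}^{L^2} t^{-\frac{1}{2}} e^{-t}\, dt. \]

Therefore,

\[ \Gamma\left(\tfrac{1}{2}\right) = \lim_{a \to 0^+} \lim_{L \to \infty} \int_{a^2}^{L^2} t^{-\frac{1}{2}} e^{-t}\, dt = \sqrt{\pi}. \]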
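These gamma function identities can be checked against Python's standard library, which provides `math.gamma`. The helper below, whose name and parameters are illustrative assumptions, also approximates $2\int_0^L e^{-x^2}dx$ directly by the midpoint rule, mirroring the first step of the proof.

```python
import math


def twice_half_gaussian(L, steps=20000):
    """Midpoint-rule approximation of 2 * integral of exp(-x^2) over [0, L]."""
    h = L / steps
    return 2.0 * h * sum(math.exp(-((k + 0.5) * h) ** 2) for k in range(steps))


# Gamma(1/2) compared with sqrt(pi) and with the Gaussian integral.
print(math.gamma(0.5), math.sqrt(math.pi), twice_half_gaussian(10.0))

# The recursion Gamma(s + 1) = s * Gamma(s) and the factorial identity.
s = 2.7
print(math.gamma(s + 1), s * math.gamma(s))
print(math.gamma(6), math.factorial(5))
```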


Another useful formula we have mentioned in volume I is the formula for the beta function $B(\alpha, \beta)$ defined as

\[ B(\alpha, \beta) = \int_0^1 t^{\alpha - 1} (1 - t)^{\beta - 1}\, dt \quad\text{when } \alpha > 0, \beta > 0. \]

It is easy to show that the integral is indeed convergent when $\alpha$ and $\beta$ are positive.
We have the following recursive formula.
Lemma 6.80 


For $\alpha > 0$ and $\beta > 0$, we have

\[ B(\alpha, \beta) = \frac{(\alpha + \beta + 1)(\alpha + \beta)}{\alpha\beta} B(\alpha + 1, \beta + 1). \]


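First notice that for $\alpha > 0$ and $\beta > 0$,

\[
\begin{aligned}
\int_0^1 t^{\alpha - 1} (1 - t)^{\beta - 1}\, dt &= \int_0^1 t^{\alpha - 1} (1 - t)^{\beta - 1} (t + 1 - t)\, dt \\
&= \int_0^1 t^{\alpha} (1 - t)^{\beta - 1}\, dt + \int_0^1 t^{\alpha - 1} (1 - t)^{\beta}\, dt.
\end{aligned}
\]

This gives

\[ B(\alpha, \beta) = B(\alpha + 1, \beta) + B(\alpha, \beta + 1). \]

Applying this formula again to the two terms on the right, we find that

\[ B(\alpha, \beta) = B(\alpha + 2, \beta) + 2B(\alpha + 1, \beta + 1) + B(\alpha, \beta + 2). \]

Using integration by parts, one can show that when $\alpha > 0$ and $\beta > 0$,

\[ \int_0^1 t^{\alpha} (1 - t)^{\beta - 1}\, dt = \frac{\alpha}{\beta} \int_0^1 t^{\alpha - 1} (1 - t)^{\beta}\, dt. \]

This gives

\[ B(\alpha + 1, \beta) = \frac{\alpha}{\beta} B(\alpha, \beta + 1). \]

Therefore,

\[
\begin{aligned}
B(\alpha, \beta) &= \left( \frac{\alpha + 1}{\beta} + 2 + \frac{\beta + 1}{\alpha} \right) B(\alpha + 1, \beta + 1) \\
&= \frac{(\alpha + \beta + 1)(\alpha + \beta)}{\alpha\beta} B(\alpha + 1, \beta + 1).
\end{aligned}
\]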

Now we can derive the explicit formula for the beta function. 


Theorem 6.81 The Beta Function 


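For any positive real numbers $\alpha$ and $\beta$,

\[ B(\alpha, \beta) = \int_0^1 t^{\alpha - 1} (1 - t)^{\beta - 1}\, dt = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)}. \]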


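We first consider the case where $\alpha > 1$ and $\beta > 1$. Let $g : [0, \infty) \times [0, \infty) \to \mathbb{R}$ be the function defined as

\[ g(u, v) = u^{\alpha - 1} v^{\beta - 1} e^{-u - v}. \]

This is a continuous function. For $L > 0$, let

\[
\begin{aligned}
\mathcal{U}_L &= \{(t, w) \mid 0 \leq t \leq 1,\ 0 \leq w \leq L\}, \\
\mathfrak{D}_L &= \{(u, v) \mid u \geq 0,\ v \geq 0,\ u + v \leq L\}.
\end{aligned}
\]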




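Consider the map $\boldsymbol{\Psi} : \mathbb{R}^2 \to \mathbb{R}^2$ defined as

\[ (u, v) = \boldsymbol{\Psi}(t, w) = (tw, (1 - t)w). \]

Notice that $\boldsymbol{\Psi}$ maps the interior of $\mathcal{U}_L$ one-to-one onto the interior of $\mathfrak{D}_L$. The Jacobian of this map is

\[ \frac{\partial(u, v)}{\partial(t, w)} = \det \begin{bmatrix} w & t \\ -w & 1 - t \end{bmatrix} = w. \]

Thus, $\boldsymbol{\Psi} : (0, 1) \times (0, \infty) \to \mathbb{R}^2$ is a smooth change of variables. By taking limits, the change of variables theorem implies that

\[ \int_{\mathcal{U}_L} (g \circ \boldsymbol{\Psi})(t, w) \left| \frac{\partial(u, v)}{\partial(t, w)} \right| dt\, dw = \int_{\mathfrak{D}_L} g(u, v)\, du\, dv. \tag{6.21} \]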


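Now Fubini's theorem says that

\[ \int_{\mathcal{U}_L} (g \circ \boldsymbol{\Psi})(t, w) \left| \frac{\partial(u, v)}{\partial(t, w)} \right| dt\, dw = \int_0^1 t^{\alpha - 1} (1 - t)^{\beta - 1}\, dt \int_0^L w^{\alpha + \beta - 1} e^{-w}\, dw. \]

Thus,

\[ \lim_{L \to \infty} \int_{\mathcal{U}_L} (g \circ \boldsymbol{\Psi})(t, w) \left| \frac{\partial(u, v)}{\partial(t, w)} \right| dt\, dw = \Gamma(\alpha + \beta) B(\alpha, \beta). \]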


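On the other hand, we notice that

\[ \left[0, \tfrac{L}{2}\right]^2 \subset \mathfrak{D}_L \subset [0, L]^2. \]

Since the function $g : [0, \infty) \times [0, \infty) \to \mathbb{R}$ is nonnegative, this implies that

\[ I_{L/2} \leq \int_{\mathfrak{D}_L} g(u, v)\, du\, dv \leq I_L, \tag{6.22} \]

where

\[ I_L = \int_{[0,L]^2} g(u, v)\, du\, dv = \int_0^L u^{\alpha - 1} e^{-u}\, du \int_0^L v^{\beta - 1} e^{-v}\, dv. \]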




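By the definition of the gamma function,

\[ \lim_{L \to \infty} I_L = \lim_{L \to \infty} I_{L/2} = \Gamma(\alpha)\Gamma(\beta). \]

Eq. (6.22) then implies that

\[ \lim_{L \to \infty} \int_{\mathfrak{D}_L} g(u, v)\, du\, dv = \Gamma(\alpha)\Gamma(\beta). \]

It follows from (6.21) that

\[ \Gamma(\alpha + \beta) B(\alpha, \beta) = \Gamma(\alpha)\Gamma(\beta), \]

which gives the desired formula when $\alpha > 1$ and $\beta > 1$.
For the general case where $\alpha > 0$ and $\beta > 0$, Lemma 6.80 and (6.20) give

\[ B(\alpha, \beta) = \frac{(\alpha + \beta + 1)(\alpha + \beta)}{\alpha\beta} B(\alpha + 1, \beta + 1) = \frac{(\alpha + \beta + 1)(\alpha + \beta)}{\alpha\beta} \frac{\Gamma(\alpha + 1)\Gamma(\beta + 1)}{\Gamma(\alpha + \beta + 2)} = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)}. \]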

This completes the proof. 
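As a numerical illustration of Theorem 6.81, the sketch below compares a midpoint-rule approximation of the beta integral with the gamma function expression, and also checks the recursion of Lemma 6.80 through the gamma formula. The function names, the sample exponents and the step count are assumptions chosen for this example.

```python
import math


def beta_midpoint(alpha, beta, steps=20000):
    """Midpoint-rule approximation of the integral of t^(alpha-1) (1-t)^(beta-1) over [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += t ** (alpha - 1) * (1.0 - t) ** (beta - 1)
    return h * total


def beta_gamma(alpha, beta):
    """The closed form Gamma(alpha) Gamma(beta) / Gamma(alpha + beta)."""
    return math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)


# Exponents at least 1 keep the integrand bounded, so the midpoint rule behaves well.
print(beta_midpoint(2.5, 3.0), beta_gamma(2.5, 3.0))

# Recursion of Lemma 6.80, evaluated with the closed form.
a, b = 0.3, 0.7
print(beta_gamma(a, b), (a + b + 1) * (a + b) / (a * b) * beta_gamma(a + 1, b + 1))
```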


Figure 6.49: $\left[0, \tfrac{L}{2}\right]^2 \subset \{(u, v) \mid u \geq 0,\ v \geq 0,\ u + v \leq L\} \subset [0, L]^2$.


Now we give an interesting application of the formula of the beta function. 




Theorem 6.82 


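For $n \geq 1$, the volume of the $n$-ball of radius $a$,

\[ B_n(a) = \left\{ (x_1, \ldots, x_n) \mid x_1^2 + \cdots + x_n^2 \leq a^2 \right\}, \]

is equal to

\[ V_n(a) = \frac{\pi^{\frac{n}{2}}}{\Gamma\left(\frac{n+2}{2}\right)} a^n. \]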


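It is easy to see that for any $a > 0$,

\[ V_n(a) = V_n a^n, \quad\text{where } V_n = V_n(1). \]

Since $B_1(1) = [-1, 1]$, we find that $V_1 = 2$. For $n \geq 2$, notice that for fixed $-1 \leq y \leq 1$, the ball $B_n(1)$ intersects the plane $x_n = y$ on the set

\[ S_n(y) = \left\{ (x_1, \ldots, x_{n-1}, y) \mid x_1^2 + \cdots + x_{n-1}^2 \leq 1 - y^2 \right\}. \]

Fubini's theorem implies that

\[
\begin{aligned}
V_n &= \int_{-1}^{1} \left( \int_{B_{n-1}\left(\sqrt{1 - y^2}\right)} dx_1 \cdots dx_{n-1} \right) dy = \int_{-1}^{1} (1 - y^2)^{\frac{n-1}{2}} V_{n-1}\, dy \\
&= V_{n-1} \int_0^1 t^{-\frac{1}{2}} (1 - t)^{\frac{n-1}{2}}\, dt = V_{n-1} B\left( \tfrac{1}{2}, \tfrac{n+1}{2} \right) = \sqrt{\pi}\, V_{n-1} \frac{\Gamma\left(\frac{n+1}{2}\right)}{\Gamma\left(\frac{n+2}{2}\right)}.
\end{aligned}
\]

Starting from $V_1 = 2 = \sqrt{\pi}/\Gamma\left(\tfrac{3}{2}\right)$, induction on $n$ gives

\[ V_n = \frac{\pi^{\frac{n}{2}}}{\Gamma\left(\frac{n+2}{2}\right)}. \]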
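The closed form for the volume of the unit $n$-ball can be compared with the slicing recursion $V_n = V_{n-1}\int_{-1}^{1}(1-y^2)^{(n-1)/2}dy$ used in the proof. The sketch below evaluates the one-dimensional integral by the midpoint rule; the function names and the step count are illustrative assumptions.

```python
import math


def unit_ball_volume(n):
    """Closed form pi^(n/2) / Gamma(n/2 + 1) for the volume of the unit n-ball."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def unit_ball_volume_recursive(n, steps=20000):
    """Volume from V_1 = 2 and V_m = V_(m-1) * integral of (1 - y^2)^((m-1)/2) over [-1, 1]."""
    volume = 2.0
    h = 2.0 / steps
    for m in range(2, n + 1):
        # Midpoint rule for the slice integral in dimension m.
        slice_integral = h * sum(
            (1.0 - (-1.0 + (k + 0.5) * h) ** 2) ** ((m - 1) / 2)
            for k in range(steps)
        )
        volume *= slice_integral
    return volume


for n in range(1, 6):
    print(n, unit_ball_volume(n), unit_ball_volume_recursive(n))
```

For $n = 2$ and $n = 3$ the closed form reduces to the familiar values $\pi$ and $4\pi/3$.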


Chapter 7. Fourier Series and Fourier Transforms 517 


Chapter 7 
Fourier Series and Fourier Transforms 


In this chapter, we shift our attention to the theory of Fourier series and Fourier 
transforms. In volume I, we have considered expansions of functions as power 
series, which are limits of polynomials. In this chapter, we consider expansions of 
functions in another class of infinitely differentiable functions — the trigonometric 
functions sin x and cosx. The reason to consider sin x and cos x is that they are 


representative of periodic functions. 


Recall that a function $f : \mathbb{R} \to \mathbb{R}$ is said to be periodic if there is a positive number $p$ so that

\[ f(x + p) = f(x) \quad\text{for all } x \in \mathbb{R}. \]

Such a number $p$ is called a period of the function $f$. If $p$ is a period of $f$, then for any positive integer $n$, $np$ is also a period of $f$.

The functions $\sin x$ and $\cos x$ are periodic functions of period $2\pi$. If $f : \mathbb{R} \to \mathbb{R}$ is a periodic function of period $p = 2L$, then the function $g : \mathbb{R} \to \mathbb{R}$ defined as

\[ g(x) = f\left( \frac{Lx}{\pi} \right) \]

is periodic of period $2\pi$. Hence, we can concentrate on functions that are periodic of period $2\pi$.
The celebrated Euler formula

\[ e^{ix} = \cos x + i \sin x \]

connects the trigonometric functions $\sin x$, $\cos x$ with the exponential function


with imaginary arguments. Hence, in this chapter, we shift our paradigm and consider complex-valued functions $f : D \to \mathbb{C}$ defined on a subset $D$ of $\mathbb{R}$. Since a complex number $z = x + iy$ with real part $x$ and imaginary part $y$ can be identified with the point $(x, y)$ in $\mathbb{R}^2$, such a function can be regarded as a function $f : D \to \mathbb{R}^2$, so that derivatives and integrals are defined componentwise. More precisely, given $x \in D$, we write $f(x) = u(x) + iv(x)$, where $u(x)$ and $v(x)$ are respectively the real part and imaginary part of $f(x)$. If $x_0$ is an interior point of $D$, we say that $f$ is differentiable at $x_0$ if the limit

\[ \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \]

exists. This is if and only if both $u : D \to \mathbb{R}$ and $v : D \to \mathbb{R}$ are differentiable at $x_0$, and we have

\[ f'(x_0) = u'(x_0) + iv'(x_0). \]

Similarly, if $[a, b]$ is a closed interval that is contained in $D$, we say that $f$ is Riemann integrable over $[a, b]$ if and only if both $u$ and $v$ are Riemann integrable over $[a, b]$, and we have

\[ \int_a^b f(x)\, dx = \int_a^b u(x)\, dx + i \int_a^b v(x)\, dx. \]

If $F : [a, b] \to \mathbb{C}$ is a continuously differentiable function, the fundamental theorem of calculus implies that

\[ F(b) - F(a) = \int_a^b F'(x)\, dx. \]


7.1 Orthogonal Systems of Functions and Fourier Series 


In the following, let $I = [a, b]$ be a compact interval in $\mathbb{R}$ unless otherwise specified. Denote by $\mathcal{R}(I, \mathbb{C})$ the set of all complex-valued functions $f : I \to \mathbb{C}$ that are Riemann integrable. Given two functions $f$ and $g$ in $\mathcal{R}(I, \mathbb{C})$, their sum $f + g$ is the function $(f + g) : I \to \mathbb{C}$,

\[ (f + g)(x) = f(x) + g(x). \]

If $\alpha$ is a complex number, the scalar product of $\alpha$ with $f$ is the function $(\alpha f) : I \to \mathbb{C}$, where

\[ (\alpha f)(x) = \alpha f(x). \]

With the addition and scalar multiplication thus defined, $\mathcal{R}(I, \mathbb{C})$ is a complex vector space. From the theory of integration, we know that the set of complex-valued continuous functions on $I$, denoted by $C(I, \mathbb{C})$, is a subspace of $\mathcal{R}(I, \mathbb{C})$.




If $f : I \to \mathbb{C}$ is Riemann integrable, so is its complex conjugate $\overline{f} : I \to \mathbb{C}$ defined as

\[ \overline{f}(x) = \overline{f(x)}. \]

In volume I, we have proved that if two real-valued functions $f : I \to \mathbb{R}$ and $g : I \to \mathbb{R}$ are Riemann integrable, so is their product $(fg) : I \to \mathbb{R}$. Using this, it is easy to check that if $f : I \to \mathbb{C}$ and $g : I \to \mathbb{C}$ are Riemann integrable complex-valued functions, $(fg) : I \to \mathbb{C}$ is also Riemann integrable.


Proposition 7.1 


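Given $f$ and $g$ in $\mathcal{R}(I, \mathbb{C})$, define

\[ \langle f, g \rangle = \int_I f(x) \overline{g(x)}\, dx. \]

For any $f, g, h$ in $\mathcal{R}(I, \mathbb{C})$, and any complex numbers $\alpha$ and $\beta$, we have the following.

(a) $\langle g, f \rangle = \overline{\langle f, g \rangle}$.
(b) $\langle \alpha f + \beta g, h \rangle = \alpha \langle f, h \rangle + \beta \langle g, h \rangle$.
(c) $\langle f, f \rangle \geq 0$.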


We call (- , - ) a positive semi-definite inner product on R(I, C). 


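It follows from (a) and (b) that

\[ \langle f, \alpha g + \beta h \rangle = \overline{\alpha} \langle f, g \rangle + \overline{\beta} \langle f, h \rangle. \]

More generally, we have

\[ \left\langle \sum_{i=1}^m \alpha_i f_i, \sum_{j=1}^n \beta_j g_j \right\rangle = \sum_{i=1}^m \sum_{j=1}^n \alpha_i \overline{\beta_j} \langle f_i, g_j \rangle. \tag{7.1} \]

If $f(x) = u(x) + iv(x)$, where $u(x) = \text{Re}\, f(x)$ and $v(x) = \text{Im}\, f(x)$, then

\[ \langle f, f \rangle = \int_I \left( u(x)^2 + v(x)^2 \right) dx. \]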




Notice that $\langle f, f \rangle = 0$ does not imply that $f = 0$. For example, take any $c$ in $[a, b]$, and define the function $f : I \to \mathbb{C}$ by

\[ f(x) = \begin{cases} 1, & \text{if } x = c, \\ 0, & \text{otherwise}. \end{cases} \]

Then

\[ \langle f, f \rangle = \int_I f(x) \overline{f(x)}\, dx = 0 \]

even though $f$ is not the zero function. This is why we call $\langle\, \cdot\, , \cdot\, \rangle$ a positive semi-definite inner product. Restricted to the subspace of continuous functions $C(I, \mathbb{C})$, $\langle\, \cdot\, , \cdot\, \rangle$ is a positive definite inner product, or simply an inner product in the usual sense.

Using the positive semi-definite inner product, we can define a semi-norm on $\mathcal{R}(I, \mathbb{C})$.


Definition 7.1 The $L^2$ Semi-Norm
Given $f : I \to \mathbb{C}$ in $\mathcal{R}(I, \mathbb{C})$, the semi-norm of $f$ is defined as

\[ \|f\| = \sqrt{\langle f, f \rangle} = \sqrt{\int_I |f(x)|^2\, dx}. \]


It has the following properties. 


Proposition 7.2 


Given $f$ in $\mathcal{R}(I, \mathbb{C})$ and $\alpha \in \mathbb{C}$, we have the following.

(a) $\|f\| \geq 0$.
(b) $\|\alpha f\| = |\alpha| \|f\|$.


The Cauchy-Schwarz inequality still holds for the positive semi-definite inner 
product on R(I, C). 




Proposition 7.3 Cauchy-Schwarz Inequality 


Given $f$ and $g$ in $\mathcal{R}(I, \mathbb{C})$,

\[ |\langle f, g \rangle| \leq \|f\| \|g\|. \]


The proof is exactly the same as for an inner product on a real vector space. An 
immediate consequence of the Cauchy-Schwarz inequality is the triangle inequality. 


Proposition 7.4 Triangle Inequality 


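Let $f_1, \ldots, f_n$ be functions in $\mathcal{R}(I, \mathbb{C})$ and let $\alpha_1, \ldots, \alpha_n$ be complex numbers. We have

\[ \|\alpha_1 f_1 + \cdots + \alpha_n f_n\| \leq |\alpha_1| \|f_1\| + \cdots + |\alpha_n| \|f_n\|. \]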


The proof is also the same as for a real inner product. One considers the case $n = 2$ first and then proves the general case by induction on $n$.

We can define orthogonality on $\mathcal{R}(I, \mathbb{C})$ the same way as for a real inner product space.


Definition 7.2 Orthogonality 

Given two functions $f$ and $g$ in $\mathcal{R}(I, \mathbb{C})$, we say that they are orthogonal if $\langle f, g \rangle = 0$.

Example 7.1 

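Let $I = [0, 2\pi]$. For $n \in \mathbb{Z}$, define $\phi_n : I \to \mathbb{C}$ by $\phi_n(x) = e^{inx}$. Show that if $m$ and $n$ are distinct integers, then $\phi_m$ and $\phi_n$ are orthogonal.

Solution
Notice that

\[ \langle \phi_m, \phi_n \rangle = \int_0^{2\pi} e^{imx} \overline{e^{inx}}\, dx = \int_0^{2\pi} e^{i(m - n)x}\, dx. \]

Since $m \neq n$, and

\[ \frac{d}{dx} \frac{e^{i(m - n)x}}{i(m - n)} = e^{i(m - n)x}, \]

the fundamental theorem of calculus implies that

\[ \langle \phi_m, \phi_n \rangle = \left[ \frac{e^{i(m - n)x}}{i(m - n)} \right]_0^{2\pi} = 0. \]

Hence, $\phi_m$ and $\phi_n$ are orthogonal.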
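The orthogonality relation can also be observed numerically. The sketch below approximates the inner product $\langle f, g \rangle = \int_0^{2\pi} f(x)\overline{g(x)}\,dx$ by the midpoint rule; the helper names `inner_product` and `phi` and the step count are assumptions made for this illustration.

```python
import cmath
import math


def inner_product(f, g, steps=4096):
    """Midpoint-rule approximation of the integral of f(x) * conj(g(x)) over [0, 2*pi]."""
    h = 2.0 * math.pi / steps
    total = 0j
    for k in range(steps):
        x = (k + 0.5) * h
        total += f(x) * complex(g(x)).conjugate()
    return total * h


def phi(n):
    """The function x -> e^{inx}."""
    return lambda x: cmath.exp(1j * n * x)


# Distinct indices give an inner product close to 0; equal indices give about 2*pi.
print(abs(inner_product(phi(3), phi(5))))
print(inner_product(phi(4), phi(4)).real, 2.0 * math.pi)
```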


Definition 7.3 Orthogonal System and Orthonormal System 


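Let $S = \{\phi_\alpha \mid \alpha \in J\}$ be a subset of functions in $\mathcal{R}(I, \mathbb{C})$ indexed by the set $J$. We say that $S$ is an orthogonal system of functions if

\[ \langle \phi_\alpha, \phi_\beta \rangle = 0 \quad\text{whenever } \alpha \neq \beta, \]

and

\[ \|\phi_\alpha\| \neq 0 \quad\text{for all } \alpha \in J. \]

We say that $S$ is an orthonormal system of functions if it is an orthogonal system and

\[ \|\phi_\alpha\| = 1 \quad\text{for all } \alpha \in J. \]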


Notice that in our definition of orthogonal system, we have an additional 
condition that each element in the set S cannot have zero norm. By definition, 
it is obvious that if S is an orthogonal system, then any subset of S is also an 
orthogonal system. The same holds for orthonormal systems. 


Example 7.2 


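Let $I = [0, 2\pi]$. For $n \in \mathbb{Z}$, define $\phi_n : I \to \mathbb{C}$ by $\phi_n(x) = e^{inx}$. Then

\[ \|\phi_n\|^2 = \int_0^{2\pi} e^{inx} e^{-inx}\, dx = 2\pi. \]

Example 7.1 implies that $S = \{\phi_n \mid n \in \mathbb{Z}\}$ is an orthogonal system.

If we let $\varphi_n : I \to \mathbb{C}$, $n \in \mathbb{Z}$ be the function

\[ \varphi_n(x) = \frac{\phi_n(x)}{\|\phi_n\|} = \frac{e^{inx}}{\sqrt{2\pi}}, \]

then $S = \{\varphi_n \mid n \in \mathbb{Z}\}$ is an orthonormal system.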


Using the semi-norm, we can define a relation $\sim$ on $\mathcal{R}(I, \mathbb{C})$ in the following way. We say that $f \sim g$ if and only if $\|f - g\| = 0$. It is easy to check that this is an equivalence relation. Reflexivity and symmetry are obvious. For transitivity, we note that if $f \sim g$ and $g \sim h$, then $\|f - g\| = 0$ and $\|g - h\| = 0$. It follows from the triangle inequality that

\[ \|f - h\| \leq \|f - g\| + \|g - h\| = 0. \]

This implies that $\|f - h\| = 0$, and thus $f \sim h$. Hence, $\sim$ is an equivalence relation on $\mathcal{R}(I, \mathbb{C})$, which we call $L^2$-equivalence.


Definition 7.4 L? Equivalent Functions 


Two Riemann integrable functions $f : I \to \mathbb{C}$ and $g : I \to \mathbb{C}$ are $L^2$-equivalent if

\[ \|f - g\| = 0. \]


Example 7.3 


Let $I = [a, b]$, and let $S$ be a finite subset of $I$. If $f : I \to \mathbb{C}$ and $g : I \to \mathbb{C}$ are two Riemann integrable functions and

\[ f(x) = g(x) \quad\text{for all } x \in [a, b] \setminus S, \]

then $f$ and $g$ are $L^2$-equivalent.


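Regarding $\mathcal{R}(I, \mathbb{C})$ as an additive group, the subset $\mathcal{K}(I, \mathbb{C})$ that contains all the functions in $\mathcal{R}(I, \mathbb{C})$ that have zero norm is a normal subgroup. They are functions that are $L^2$-equivalent to the zero function. Denote by

\[ \widehat{\mathcal{R}}(I, \mathbb{C}) = \mathcal{R}(I, \mathbb{C}) / \mathcal{K}(I, \mathbb{C}) \]

the quotient group. Then each element of $\widehat{\mathcal{R}}(I, \mathbb{C})$ is an $L^2$-equivalence class of functions.
If $u$ is in $\mathcal{K}(I, \mathbb{C})$ and $g$ is in $\mathcal{R}(I, \mathbb{C})$, then the Cauchy-Schwarz inequality implies that

\[ |\langle u, g \rangle| \leq \|u\| \|g\| = 0. \]

Thus, $\langle u, g \rangle = 0$. If $f$ is $L^2$-equivalent to $f_1$ and $g$ is $L^2$-equivalent to $g_1$, there exist $u$ and $v$ in $\mathcal{K}(I, \mathbb{C})$ such that $f_1 = f + u$ and $g_1 = g + v$. Therefore,

\[ \langle f_1, g_1 \rangle = \langle f + u, g + v \rangle = \langle f, g \rangle + \langle u, g \rangle + \langle f, v \rangle + \langle u, v \rangle = \langle f, g \rangle. \]

Hence, the positive semi-definite inner product $\langle\, \cdot\, , \cdot\, \rangle$ on $\mathcal{R}(I, \mathbb{C})$ induces an inner product on $\widehat{\mathcal{R}}(I, \mathbb{C})$ by

\[ \langle [f], [g] \rangle = \langle f, g \rangle. \]

If $[f] \in \widehat{\mathcal{R}}(I, \mathbb{C})$ is such that

\[ \langle [f], [f] \rangle = \langle f, f \rangle = 0, \]

then $f$ is in $\mathcal{K}(I, \mathbb{C})$, and thus, $[f] = [0]$. This says that the inner product $\langle\, \cdot\, , \cdot\, \rangle$ on $\widehat{\mathcal{R}}(I, \mathbb{C})$ is positive definite. The additional condition we impose on a subset $S$ of $\mathcal{R}(I, \mathbb{C})$ to be an orthogonal system just means that none of the elements in $S$ is $L^2$-equivalent to the zero function.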

For an orthogonal system, we have the following from (7.1). 


Theorem 7.5 Generalized Pythagoras Theorem 


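Let $S = \{\phi_k \mid 1 \leq k \leq n\}$ be an orthogonal system of functions in $\mathcal{R}(I, \mathbb{C})$. For any complex numbers $\alpha_1, \ldots, \alpha_n$,

\[ \|\alpha_1 \phi_1 + \cdots + \alpha_n \phi_n\|^2 = |\alpha_1|^2 \|\phi_1\|^2 + \cdots + |\alpha_n|^2 \|\phi_n\|^2. \]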


The functions $\phi_n : [0, 2\pi] \to \mathbb{C}$, $\phi_n(x) = e^{inx}$, $n \in \mathbb{Z}$ are easy to deal with because $\dfrac{d}{dx} e^{ax} = a e^{ax}$ for any complex number $a$. The drawback is that they are complex-valued functions. Since

\[ e^{inx} = \cos nx + i \sin nx, \]

if one wants to work with real-valued functions, one should consider the functions $\cos nx$ and $\sin nx$.


Proposition 7.6 


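Let $I = [0, 2\pi]$, and define the functions $C_n : I \to \mathbb{R}$, $n \geq 0$, and $S_n : I \to \mathbb{R}$, $n \geq 1$ by

\[ C_n(x) = \cos nx, \qquad S_n(x) = \sin nx. \]

Then $\mathcal{B} = \{C_n \mid n \geq 0\} \cup \{S_n \mid n \geq 1\}$ is an orthogonal system, and

\[ \|C_0\| = \sqrt{2\pi}, \qquad \|C_n\| = \|S_n\| = \sqrt{\pi} \quad\text{when } n \geq 1. \]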


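For $n \in \mathbb{Z}$, let $\phi_n : [0, 2\pi] \to \mathbb{C}$ be the function $\phi_n(x) = e^{inx}$. Then $C_0 = \phi_0$, and when $n \in \mathbb{Z}^+$,

\[ C_n = \frac{\phi_n + \phi_{-n}}{2}, \qquad S_n = \frac{\phi_n - \phi_{-n}}{2i}. \]

Since $\{\phi_n \mid n \in \mathbb{Z}\}$ is an orthogonal system, we find that for $n \in \mathbb{Z}^+$,

\[ \langle C_0, C_n \rangle = \frac{1}{2} \langle \phi_0, \phi_n \rangle + \frac{1}{2} \langle \phi_0, \phi_{-n} \rangle = 0, \]

\[ \langle C_0, S_n \rangle = \frac{i}{2} \langle \phi_0, \phi_n \rangle - \frac{i}{2} \langle \phi_0, \phi_{-n} \rangle = 0. \]

For $m, n \in \mathbb{Z}^+$ such that $m \neq n$,

\[ \langle C_m, C_n \rangle = \frac{1}{4} \left( \langle \phi_m, \phi_n \rangle + \langle \phi_m, \phi_{-n} \rangle + \langle \phi_{-m}, \phi_n \rangle + \langle \phi_{-m}, \phi_{-n} \rangle \right) = 0, \]

\[ \langle S_m, S_n \rangle = \frac{1}{4} \left( \langle \phi_m, \phi_n \rangle - \langle \phi_m, \phi_{-n} \rangle - \langle \phi_{-m}, \phi_n \rangle + \langle \phi_{-m}, \phi_{-n} \rangle \right) = 0. \]

For $m, n \in \mathbb{Z}^+$, considering the cases $m = n$ and $m \neq n$ separately, we find that

\[ \langle S_m, C_n \rangle = \frac{1}{4i} \left( \langle \phi_m, \phi_n \rangle + \langle \phi_m, \phi_{-n} \rangle - \langle \phi_{-m}, \phi_n \rangle - \langle \phi_{-m}, \phi_{-n} \rangle \right) = 0. \]

These show that $\mathcal{B}$ is an orthogonal system. For $n \in \mathbb{Z}^+$, since $\|\phi_n\| = \|\phi_{-n}\| = \sqrt{2\pi}$, and $\phi_n$ and $\phi_{-n}$ are orthogonal, we have

\[ \|C_n\|^2 = \left\| \frac{1}{2} \phi_n + \frac{1}{2} \phi_{-n} \right\|^2 = \frac{1}{4} \|\phi_n\|^2 + \frac{1}{4} \|\phi_{-n}\|^2 = \pi, \]

\[ \|S_n\|^2 = \left\| \frac{1}{2i} \phi_n - \frac{1}{2i} \phi_{-n} \right\|^2 = \frac{1}{4} \|\phi_n\|^2 + \frac{1}{4} \|\phi_{-n}\|^2 = \pi. \]

These complete the proof.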


Given a finite subset $S = \{f_1, \ldots, f_n\}$ of $\mathcal{R}(I, \mathbb{C})$, let

\[ W_S = \text{span}\, S = \{c_1 f_1 + \cdots + c_n f_n \mid c_1, \ldots, c_n \in \mathbb{C}\} \]

be the subspace of $\mathcal{R}(I, \mathbb{C})$ spanned by $S$. We say that an element $g$ of $\mathcal{R}(I, \mathbb{C})$ is orthogonal to $W_S$ if it is orthogonal to each $f \in W_S$. This is if and only if $g$ is orthogonal to $f_k$ for all $1 \leq k \leq n$. The projection theorem says the following.


Theorem 7.7 Projection Theorem 


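Let $S = \{\phi_1, \ldots, \phi_n\}$ be an orthogonal system of functions in $\mathcal{R}(I, \mathbb{C})$, and let $W_S$ be the subspace of $\mathcal{R}(I, \mathbb{C})$ spanned by $S$. Given $f$ in $\mathcal{R}(I, \mathbb{C})$, there is a unique $g \in W_S$ such that $f - g$ is orthogonal to $W_S$. It is called the projection of the function $f$ onto the subspace $W_S$, denoted by $\text{proj}_{W_S} f$, and it is given by

\[ \text{proj}_{W_S} f = \sum_{k=1}^n \frac{\langle f, \phi_k \rangle}{\langle \phi_k, \phi_k \rangle} \phi_k = \frac{\langle f, \phi_1 \rangle}{\langle \phi_1, \phi_1 \rangle} \phi_1 + \cdots + \frac{\langle f, \phi_n \rangle}{\langle \phi_n, \phi_n \rangle} \phi_n. \]

For any $h \in W_S$,

\[ \|f - h\| \geq \|f - \text{proj}_{W_S} f\|. \]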


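Assume that $g$ is a function in $W_S$ such that $f - g$ is orthogonal to $W_S$. Then there exist complex numbers $a_1, \ldots, a_n$ such that

\[ g = a_1 \phi_1 + \cdots + a_n \phi_n. \]

Since $\langle \phi_k, \phi_l \rangle = 0$ if $k \ne l$, we find that

\[ \langle g, \phi_k \rangle = a_k \langle \phi_k, \phi_k \rangle \quad \text{for } 1 \le k \le n. \]

Since $f - g$ is orthogonal to $W_S$, $\langle f - g, \phi_k \rangle = 0$ for all $1 \le k \le n$. This gives $\langle f, \phi_k \rangle = \langle g, \phi_k \rangle$, and thus

\[ a_k \langle \phi_k, \phi_k \rangle = \langle f, \phi_k \rangle \quad \text{for } 1 \le k \le n. \]

Hence, we must have

\[ a_k = \frac{\langle f, \phi_k \rangle}{\langle \phi_k, \phi_k \rangle}. \]

This implies the uniqueness of $g$ if it exists. It is easy to check that the function

\[ g = \sum_{k=1}^{n} \frac{\langle f, \phi_k \rangle}{\langle \phi_k, \phi_k \rangle}\, \phi_k \]

is indeed a function in $W_S$ such that $f - g$ is orthogonal to $W_S$.

Finally, for any $h$ in $W_S$, $g - h$ is also in $W_S$. Hence, $g - h$ is orthogonal to $f - g$. By the generalized Pythagoras theorem,

\[ \|f - h\|^2 = \|(f - g) + (g - h)\|^2 = \|f - g\|^2 + \|g - h\|^2 \ge \|f - g\|^2. \]

This proves that

\[ \|f - h\| \ge \|f - \operatorname{proj}_{W_S} f\| \quad \text{for all } h \in W_S. \]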
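To illustrate the projection theorem, the sketch below projects $f(x) = x$ onto the span of the orthogonal system $\{\sin x, \sin 2x\}$ in $\mathcal{R}([-\pi, \pi], \mathbb{C})$. The choice of $f$ and of the system, the helper names, and the use of a midpoint rule for the integrals are all assumptions made for this illustration.

```python
import math

def inner(f, g, n=4000):
    """Midpoint-rule approximation of the integral of f(x) * g(x) over [-pi, pi].

    All functions used here are real-valued, so no conjugation is needed.
    """
    a, b = -math.pi, math.pi
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) * g(a + (j + 0.5) * h) for j in range(n))

f = lambda x: x
basis = [lambda x: math.sin(x), lambda x: math.sin(2.0 * x)]  # an orthogonal system

# Coefficients <f, phi_k> / <phi_k, phi_k> given by the projection theorem.
coeffs = [inner(f, phi) / inner(phi, phi) for phi in basis]

def combo(cs):
    """Return the linear combination sum_k cs[k] * basis[k]."""
    return lambda x: sum(c * phi(x) for c, phi in zip(cs, basis))

def dist_sq(cs):
    """Squared L2 distance between f and the combination with coefficients cs."""
    g = combo(cs)
    diff = lambda x: f(x) - g(x)
    return inner(diff, diff)

proj = combo(coeffs)
# f - proj should be orthogonal to every basis function.
residual_inner = [inner(lambda x: f(x) - proj(x), phi) for phi in basis]
best = dist_sq(coeffs)
# Any other element of the span should be at least as far from f.
perturbed = dist_sq([coeffs[0] + 0.1, coeffs[1] - 0.2])
print(coeffs, residual_inner, best, perturbed)
```

By the generalized Pythagoras theorem, the gap `perturbed - best` should be close to $0.1^2 \pi + 0.2^2 \pi$, the squared norm of the perturbation.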


Now we restrict our consideration to functions $f$ that are periodic of period $2\pi$. In this case, the function is uniquely determined by its values on an interval $[a, b]$ of length $2\pi$. We often take $I = [0, 2\pi]$ or $I = [-\pi, \pi]$. Notice that if $f : \mathbb{R} \to \mathbb{C}$ is a function of period $2\pi$, then for any $a \in \mathbb{R}$,

\[ \int_a^{a+2\pi} f(x)\,dx = \int_0^{2\pi} f(x)\,dx = \int_{-\pi}^{\pi} f(x)\,dx. \]

Any function $f : [a, a + 2\pi] \to \mathbb{C}$ defined on an interval of length $2\pi$ can be extended to be a $2\pi$-periodic function.




Definition 7.5 Extension of Functions

Let $I = [a, a + 2\pi]$ be an interval of length $2\pi$, and let $f : I \to \mathbb{C}$ be a function defined on $I$. We can extend $f$ to be a $2\pi$-periodic function $\tilde{f} : \mathbb{R} \to \mathbb{C}$ in the following way.

(i) For $x \in (a, a + 2\pi)$, define

\[ \tilde{f}(x + 2n\pi) = f(x) \quad \text{for all } n \in \mathbb{Z}. \]

(ii) For $x = a$, define

\[ \tilde{f}(a + 2n\pi) = \frac{f(a) + f(a + 2\pi)}{2} \quad \text{for all } n \in \mathbb{Z}. \]


Examples are shown in Figures 7.1 and 7.2. 


Figure 7.1: Extending a function $f : [-\pi, \pi] \to \mathbb{R}$ periodically.

Figure 7.2: Extending a function $f : [-\pi, \pi] \to \mathbb{R}$ periodically.


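Now let us define Fourier series. Example 7.2 asserts that the set

\[ S = \{\phi_n \mid n \in \mathbb{Z}\}, \quad \text{where } \phi_n(x) = e^{inx}, \]

is an orthogonal system of functions in $\mathcal{R}(I, \mathbb{C})$, where $I = [-\pi, \pi]$. For $n \ge 0$, let $W_n$ be the subspace of $\mathcal{R}(I, \mathbb{C})$ spanned by $S_n = \{e^{ikx} \mid -n \le k \le n\}$. It is a vector space of dimension $2n + 1$ with basis $S_n$. Moreover,

\[ W_0 \subset W_1 \subset W_2 \subset \cdots. \]

A real basis of $W_n$ is given by

\[ \mathcal{B}_n = \{\sin kx \mid k = 1, \ldots, n\} \cup \{\cos kx \mid k = 0, 1, \ldots, n\}. \]

Given $f \in \mathcal{R}(I, \mathbb{C})$, let $s_n = \operatorname{proj}_{W_n} f$ be the projection of $f$ onto $W_n$. The projection theorem says that

\[ s_n(x) = (\operatorname{proj}_{W_n} f)(x) = \sum_{k=-n}^{n} c_k e^{ikx}, \]

where

\[ c_k = \frac{\langle f, \phi_k \rangle}{\langle \phi_k, \phi_k \rangle} = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-ikx}\,dx. \]

By Proposition 7.6 and the projection theorem, $s_n(x)$ can also be written as

\[ s_n(x) = (\operatorname{proj}_{W_n} f)(x) = \frac{a_0}{2} + \sum_{k=1}^{n} \left( a_k \cos kx + b_k \sin kx \right), \]

where

\[ a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos kx\,dx \quad \text{for } 0 \le k \le n, \]

\[ b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin kx\,dx \quad \text{for } 1 \le k \le n. \]

By definition, we find that

\[ c_0 = \frac{a_0}{2}, \]

and when $k \ge 1$,

\[ c_{-k} = \frac{a_k + i b_k}{2}, \qquad c_k = \frac{a_k - i b_k}{2}. \]

If $f$ is a real-valued function, $a_k$ and $b_k$ are real and $c_{-k} = \overline{c_k}$.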




Since a trigonometric series can be expressed in the form $\sum_{k=-\infty}^{\infty} c_k e^{ikx}$, we also call a series of the form $\sum_{k=-\infty}^{\infty} c_k e^{ikx}$ a trigonometric series. The Fourier series of a function is a trigonometric series.


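Definition 7.7 Fourier Series and its $n^{\text{th}}$ Partial Sums

Let $I = [-\pi, \pi]$. The Fourier series of a function $f$ in $\mathcal{R}(I, \mathbb{C})$ is the infinite series

\[ \sum_{k=-\infty}^{\infty} c_k e^{ikx} = \frac{a_0}{2} + \sum_{k=1}^{\infty} \left( a_k \cos kx + b_k \sin kx \right), \]

where

\[ c_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-ikx}\,dx, \]

\[ a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos kx\,dx, \]

\[ b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin kx\,dx. \]

The $n^{\text{th}}$ partial sum of the Fourier series is

\[ s_n(x) = \sum_{k=-n}^{n} c_k e^{ikx} = \frac{a_0}{2} + \sum_{k=1}^{n} \left( a_k \cos kx + b_k \sin kx \right). \]

It is the projection of $f$ onto the subspace of $\mathcal{R}(I, \mathbb{C})$ spanned by $S_n = \{e^{ikx} \mid -n \le k \le n\}$.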


Remark 7.1 


If $f : \mathbb{R} \to \mathbb{C}$ is a $2\pi$-periodic function which is Riemann integrable over a closed interval of length $2\pi$, the Fourier series of $f$ is the Fourier series of $f : [-\pi, \pi] \to \mathbb{C}$.




Remark 7.2 


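If $I = [-L, L]$, the Fourier series of a function $f \in \mathcal{R}(I, \mathbb{C})$ is the series

\[ \sum_{k=-\infty}^{\infty} c_k \exp\left( \frac{i\pi k x}{L} \right), \]

where

\[ c_k = \frac{1}{2L} \int_{-L}^{L} f(x) \exp\left( -\frac{i\pi k x}{L} \right) dx. \]

Henceforth, we only consider the case where $I$ is a closed interval of length $2\pi$. Let us look at some examples.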


Example 7.4 


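Find the Fourier series of the function $f : [-\pi, \pi] \to \mathbb{R}$ defined as

\[ f(x) = x. \]

Solution

Since $f$ is an odd function, $a_k = 0$ for all $k \ge 0$. For $k \ge 1$,

\[ b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} x \sin kx\,dx = \frac{2(-1)^{k-1}}{k}. \]

Therefore, the Fourier series of $f$ is

\[ 2 \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k} \sin kx. \]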




Figure 7.3: The function $f(x) = x$, $-\pi < x < \pi$, and $s_n(x)$, $1 \le n \le 5$.


Remark 7.3 


Let $I = [-\pi, \pi]$. Given $f$ in $\mathcal{R}(I, \mathbb{C})$, we call each

\[ c_k = \frac{1}{2\pi} \int_I f(x) e^{-ikx}\,dx, \quad k \in \mathbb{Z}, \]

a Fourier coefficient of $f$. The mapping $\mathcal{F}_k$ from $\mathcal{R}(I, \mathbb{C})$ to $\mathbb{C}$ which takes a function $f$ to $c_k$ is a linear transformation between vector spaces.

When $f : I \to \mathbb{R}$ is a real-valued function, we usually prefer to work with the Fourier coefficients $a_k$ and $b_k$. One can show that if $f : [-\pi, \pi] \to \mathbb{R}$ is an odd function, then $a_k = 0$ for all $k \ge 0$, so that the Fourier series of $f$ only has sine terms. If $f : [-\pi, \pi] \to \mathbb{R}$ is an even function, then $b_k = 0$ for all $k \ge 1$, so that the Fourier series of $f$ only has the constant and the cosine terms.


Remark 7.4 


If $f : I \to \mathbb{C}$ is a function of the form

\[ f(x) = \sum_{k \in J} c_k e^{ikx}, \]

where $J$ is a finite subset of integers, then the Fourier series of $f$ is equal to itself.




Example 7.5 


The Fourier series of a constant function $f : I \to \mathbb{C}$, $f(x) = c$, is just $c$ itself.


Example 7.6 


Let $f : [0, 2\pi] \to \mathbb{R}$ be the function defined as

\[ f(x) = x(2\pi - x). \]

Find its Fourier series, and express it in terms of trigonometric functions.

Figure 7.4: The function $f : [0, 2\pi] \to \mathbb{R}$, $f(x) = x(2\pi - x)$ and its extension.


Solution 


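When we extend $f$ periodically to the function $\tilde{f} : \mathbb{R} \to \mathbb{R}$, we find that when $x \in [0, 2\pi]$,

\[ \tilde{f}(-x) = f(-x + 2\pi) = (2\pi - x)x = f(x) = \tilde{f}(x). \]

Hence, $\tilde{f}(x)$ is an even function. This implies that the Fourier series of $f : [0, 2\pi] \to \mathbb{R}$ only has cosine terms.

\[ a_0 = \frac{1}{\pi} \int_0^{2\pi} f(x)\,dx = \frac{1}{\pi} \int_0^{2\pi} (2\pi x - x^2)\,dx = \frac{1}{\pi} \left[ \pi x^2 - \frac{x^3}{3} \right]_0^{2\pi} = \frac{4\pi^2}{3}. \]

For $k \ge 1$,

\[ a_k = \frac{1}{\pi} \int_0^{2\pi} f(x) \cos kx\,dx = \frac{1}{\pi} \int_0^{2\pi} (2\pi x - x^2) \cos kx\,dx \]

\[ = \frac{1}{\pi} \left( \left[ \frac{1}{k} (2\pi x - x^2) \sin kx \right]_0^{2\pi} - \frac{1}{k} \int_0^{2\pi} (2\pi - 2x) \sin kx\,dx \right) \]

\[ = \frac{1}{\pi k^2} \Big[ (2\pi - 2x) \cos kx \Big]_0^{2\pi} + \frac{2}{\pi k^2} \int_0^{2\pi} \cos kx\,dx \]

\[ = -\frac{4}{k^2} + \frac{2}{\pi k^3} \Big[ \sin kx \Big]_0^{2\pi} = -\frac{4}{k^2}. \]

Therefore, the Fourier series of $f$ is

\[ \frac{2\pi^2}{3} - 4 \sum_{k=1}^{\infty} \frac{\cos kx}{k^2}. \]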
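As a sanity check on the integration by parts, one can approximate the cosine coefficients of $f(x) = x(2\pi - x)$ numerically; a direct computation gives $a_0 = 4\pi^2/3$ and $a_k = -4/k^2$ for $k \ge 1$. The sketch below assumes a midpoint rule on $[0, 2\pi]$, and the function name `cosine_coefficient` is ours.

```python
import math

def cosine_coefficient(f, k, n=20000):
    """Midpoint-rule approximation of a_k = (1/pi) * integral over [0, 2*pi] of f(x) cos(kx)."""
    h = 2.0 * math.pi / n
    total = sum(f((j + 0.5) * h) * math.cos(k * (j + 0.5) * h) for j in range(n))
    return total * h / math.pi

f = lambda x: x * (2.0 * math.pi - x)

a0 = cosine_coefficient(f, 0)                             # compare with 4 * pi^2 / 3
a = {k: cosine_coefficient(f, k) for k in range(1, 6)}    # compare with -4 / k^2
print(a0, a)
```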


Example 7.7 


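Let $a$ and $b$ be two numbers satisfying $-\pi < a < b < \pi$, and let $g : [-\pi, \pi] \to \mathbb{R}$ be the function defined as

\[ g(x) = \begin{cases} 1, & \text{if } a \le x \le b, \\ 0, & \text{otherwise}. \end{cases} \]

Find the Fourier series of $g$ in exponential form.

Figure 7.5: The function $g : [-\pi, \pi] \to \mathbb{R}$ defined in Example 7.7 and its extension.

Solution

Since $g$ is piecewise continuous, it is Riemann integrable.

\[ c_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(x)\,dx = \frac{1}{2\pi} \int_a^b dx = \frac{b - a}{2\pi}. \]

For $k \ge 1$,

\[ c_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(x) e^{-ikx}\,dx = \frac{1}{2\pi} \int_a^b e^{-ikx}\,dx = \frac{e^{-ikb} - e^{-ika}}{-2\pi i k}, \]

\[ c_{-k} = \frac{1}{2\pi} \int_a^b e^{ikx}\,dx = \frac{e^{ikb} - e^{ika}}{2\pi i k}. \]

Therefore, the Fourier series of $g$ is

\[ \frac{b - a}{2\pi} + \sum_{k=1}^{\infty} \left( \frac{e^{-ikb} - e^{-ika}}{-2\pi i k}\, e^{ikx} + \frac{e^{ikb} - e^{ika}}{2\pi i k}\, e^{-ikx} \right). \]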


Example 7.8 


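Find the Fourier series of the function $f : [-\pi, \pi] \to \mathbb{R}$, $f(x) = x \sin x$.

Solution
Since $f$ is a real-valued even function, we only need to compute the Fourier coefficients

\[ a_k(f) = \frac{1}{\pi} \int_{-\pi}^{\pi} x \sin x \cos kx\,dx \quad \text{when } k \ge 0. \]

Let $g : [-\pi, \pi] \to \mathbb{R}$ be the function $g(x) = x$. We have seen in Example 7.4 that the Fourier series of $g$ is given by

\[ \sum_{k=1}^{\infty} b_k(g) \sin kx, \quad \text{where } b_k(g) = \frac{1}{\pi} \int_{-\pi}^{\pi} x \sin kx\,dx = \frac{2(-1)^{k-1}}{k}. \]

From this, we find that

\[ a_0(f) = \frac{1}{\pi} \int_{-\pi}^{\pi} x \sin x\,dx = b_1(g) = 2, \]

\[ a_1(f) = \frac{1}{2\pi} \int_{-\pi}^{\pi} x \sin 2x\,dx = \frac{1}{2} b_2(g) = -\frac{1}{2}, \]

and when $k \ge 2$,

\[ a_k(f) = \frac{1}{2\pi} \int_{-\pi}^{\pi} x \left( \sin(k+1)x - \sin(k-1)x \right) dx = \frac{1}{2} \left( b_{k+1}(g) - b_{k-1}(g) \right) \]

\[ = \frac{(-1)^k}{k+1} - \frac{(-1)^k}{k-1} = \frac{2(-1)^{k-1}}{k^2 - 1}. \]

Hence, the Fourier series of the function $f : [-\pi, \pi] \to \mathbb{R}$, $f(x) = x \sin x$, is

\[ 1 - \frac{1}{2} \cos x + 2 \sum_{k=2}^{\infty} \frac{(-1)^{k-1}}{k^2 - 1} \cos kx. \]

Figure 7.6: The function $f : [-\pi, \pi] \to \mathbb{R}$, $f(x) = x \sin x$ and its periodic extension.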
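The coefficients of $x \sin x$ can be cross-checked numerically. A direct computation gives $a_0 = 2$, $a_1 = -1/2$ and $a_k = 2(-1)^{k-1}/(k^2 - 1)$ for $k \ge 2$; the sketch below compares these closed forms with a midpoint-rule approximation of the defining integrals. The function names are ours and the quadrature is an assumption of the sketch.

```python
import math

def a_coeff(k, n=20000):
    """Midpoint-rule approximation of a_k = (1/pi) * integral over [-pi, pi] of x sin(x) cos(kx)."""
    h = 2.0 * math.pi / n
    total = 0.0
    for j in range(n):
        x = -math.pi + (j + 0.5) * h
        total += x * math.sin(x) * math.cos(k * x)
    return total * h / math.pi

def closed_form(k):
    """Closed forms: a_0 = 2, a_1 = -1/2, a_k = 2(-1)^(k-1) / (k^2 - 1) for k >= 2."""
    if k == 0:
        return 2.0
    if k == 1:
        return -0.5
    return 2.0 * (-1) ** (k - 1) / (k * k - 1)

# Each row holds (k, numerical value, closed form).
table = [(k, a_coeff(k), closed_form(k)) for k in range(6)]
for row in table:
    print(row)
```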


At the end of this section, let us make an additional remark. 




Remark 7.5 Semi-Norms 


A semi-norm on a complex vector space $V$ is a function $\|\cdot\| : V \to \mathbb{R}$ which defines the norm $\|v\|$ for each $v$ in $V$ such that the following hold.

(a) For any $v \in V$, $\|v\| \ge 0$.

(b) For any $\alpha \in \mathbb{C}$, and any $v \in V$, $\|\alpha v\| = |\alpha| \|v\|$.

(c) For any $u$ and $v$ in $V$, $\|u + v\| \le \|u\| + \|v\|$.

If in addition, we have

(d) $\|v\| = 0$ if and only if $v = 0$,

then $\|\cdot\|$ is called a norm on the vector space $V$.
Proposition 7.2 and Proposition 7.4 justify that the $L^2$-norm

\[ \|f\|_2 = \sqrt{\int_I |f(x)|^2\,dx} \]

is indeed a semi-norm on the vector space $\mathcal{R}(I, \mathbb{C})$.
There are other semi-norms on $\mathcal{R}(I, \mathbb{C})$. One of them which will also be useful later is the $L^1$-norm defined as

\[ \|f\|_1 = \int_I |f(x)|\,dx. \]

The fact that this is a semi-norm is quite easy to establish.




Exercises 7.1 


Question 1 


Let $f : [-\pi, \pi] \to \mathbb{R}$ be a real-valued Riemann integrable function.

(a) If $f : [-\pi, \pi] \to \mathbb{R}$ is an odd function, show that the Fourier series of $f$ has the form

\[ \sum_{k=1}^{\infty} b_k \sin kx, \quad \text{where } b_k = \frac{2}{\pi} \int_0^{\pi} f(x) \sin kx\,dx. \]

(b) If $f : [-\pi, \pi] \to \mathbb{R}$ is an even function, show that the Fourier series of $f$ has the form

\[ \frac{a_0}{2} + \sum_{k=1}^{\infty} a_k \cos kx, \quad \text{where } a_k = \frac{2}{\pi} \int_0^{\pi} f(x) \cos kx\,dx. \]


Question 2 


Find the Fourier series of the function $f : [-\pi, \pi] \to \mathbb{R}$, $f(x) = |x|$, and express it in terms of trigonometric functions.


Question 3


Find the Fourier series of the function $f : [-\pi, \pi] \to \mathbb{R}$, $f(x) = x^2$, and express it in terms of trigonometric functions.


Question 4


Find the Fourier series of the function $f : [0, 2\pi] \to \mathbb{R}$, $f(x) = x^2$, and express it in terms of trigonometric functions.


Question 5


Find the Fourier series of the function $f : [-\pi, \pi] \to \mathbb{R}$, $f(x) = \sin 2x$, and express it in terms of trigonometric functions.




Question 6 


Find the Fourier series of the function $f : [-\pi, \pi] \to \mathbb{R}$,
if — mr 
sinz, Lf; VO Soe a. 
and express it in terms of trigonometric functions. 


Question 7 


Find the Fourier series of the function $f : [-\pi, \pi] \to \mathbb{R}$, $f(x) = x \cos x$, from the Fourier series of the function $g : [-\pi, \pi] \to \mathbb{R}$, $g(x) = x$.


Question 8


Let $x_0$ be a point in the interval $[a, b]$, and let $f : [a, b] \to \mathbb{C}$ and $g : [a, b] \to \mathbb{C}$ be $L^2$-equivalent Riemann integrable functions. Assume that both $f$ and $g$ are continuous at the point $x_0$. Show that $f(x_0) = g(x_0)$.




7.2 The Pointwise Convergence of a Fourier Series 


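Let $I = [-\pi, \pi]$ and let $\mathcal{R}(I, \mathbb{C})$ be the vector space that consists of all Riemann integrable functions $f : I \to \mathbb{C}$ that are defined on $I$. Each of these functions can be extended to a periodic function $\tilde{f} : \mathbb{R} \to \mathbb{C}$ so that $\tilde{f}(x) = f(x)$ for all $x$ in the interior of $I$.

Given $f \in \mathcal{R}(I, \mathbb{C})$, we define the Fourier series of $f$ as the infinite series

\[ \sum_{k=-\infty}^{\infty} c_k e^{ikx} = \frac{a_0}{2} + \sum_{k=1}^{\infty} \left( a_k \cos kx + b_k \sin kx \right), \]

where

\[ c_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-ikx}\,dx, \quad k \in \mathbb{Z}, \]

\[ a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos kx\,dx, \quad k \ge 0, \]

\[ b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin kx\,dx, \quad k \ge 1. \]

The problem of interest to us is the convergence of the Fourier series. Given $n \ge 0$, the $n^{\text{th}}$ partial sum of the Fourier series of $f$ is

\[ s_n(x) = \sum_{k=-n}^{n} c_k e^{ikx}. \]

We say that the Fourier series converges pointwise if the sequence of partial sum functions $\{s_n : I \to \mathbb{C}\}$ converges pointwise. Let us first give an integral expression for the partial sums $s_n(x)$ of the Fourier series. By definition,

\[ s_n(x) = \frac{1}{2\pi} \sum_{k=-n}^{n} \left( \int_{-\pi}^{\pi} f(t) e^{-ikt}\,dt \right) e^{ikx} = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) \sum_{k=-n}^{n} e^{ik(x-t)}\,dt. \]

If $x \in 2\pi\mathbb{Z}$, $e^{ikx} = 1$ for all $-n \le k \le n$. Therefore,

\[ \sum_{k=-n}^{n} e^{ikx} = 2n + 1. \]

If $x \notin 2\pi\mathbb{Z}$, $e^{ix} \ne 1$. Using the sum formula for a geometric sequence, we have

\[ \sum_{k=-n}^{n} e^{ikx} = e^{-inx} \left( 1 + e^{ix} + \cdots + e^{2inx} \right) = \frac{e^{-i\left(n + \frac{1}{2}\right)x}}{e^{-\frac{ix}{2}}} \times \frac{e^{i(2n+1)x} - 1}{e^{ix} - 1} = \frac{\sin\left(n + \frac{1}{2}\right)x}{\sin\frac{x}{2}}. \]

Given a nonnegative integer $n$, the Dirichlet kernel $D_n : \mathbb{R} \to \mathbb{R}$ is

\[ D_n(x) = \begin{cases} 2n + 1, & \text{if } x \in 2\pi\mathbb{Z}, \\[4pt] \dfrac{\sin\left(n + \frac{1}{2}\right)x}{\sin\frac{x}{2}}, & \text{otherwise}. \end{cases} \]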


Our derivation above gives the following. 


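Proposition 7.8

Let $I = [-\pi, \pi]$, and let $f : I \to \mathbb{C}$ be a Riemann integrable function. The $n^{\text{th}}$ partial sum $s_n(x)$ of the Fourier series of $f$ has an integral representation given by

\[ s_n(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) D_n(x - t)\,dt, \]

where $D_n : \mathbb{R} \to \mathbb{R}$ is the Dirichlet kernel given by

\[ D_n(x) = \begin{cases} 2n + 1, & \text{if } x \in 2\pi\mathbb{Z}, \\[4pt] \dfrac{\sin\left(n + \frac{1}{2}\right)x}{\sin\frac{x}{2}}, & \text{otherwise}. \end{cases} \]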
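The closed form of the Dirichlet kernel can be compared with the defining sum of exponentials at a few sample points. This is a small numerical sketch; the function names and the tolerance used to detect multiples of $2\pi$ are assumptions of the sketch.

```python
import cmath
import math

def dirichlet_sum(n, x):
    """D_n(x) computed directly as the sum of e^{ikx} for k = -n, ..., n."""
    return sum(cmath.exp(1j * k * x) for k in range(-n, n + 1)).real

def dirichlet_closed(n, x):
    """Closed form: 2n + 1 if x is a multiple of 2*pi, else sin((n + 1/2) x) / sin(x / 2)."""
    s = math.sin(x / 2.0)
    if abs(s) < 1e-12:          # numerical stand-in for "x belongs to 2*pi*Z"
        return 2.0 * n + 1.0
    return math.sin((n + 0.5) * x) / s

samples = [(n, x) for n in (0, 1, 5) for x in (-2.5, -0.3, 0.0, 1.0, 3.0)]
max_gap = max(abs(dirichlet_sum(n, x) - dirichlet_closed(n, x)) for n, x in samples)
print(max_gap)
```

The sample points are arbitrary; the two expressions should agree up to rounding at any real $x$.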




Figure 7.7: The Dirichlet kernels $D_n : [-\pi, \pi] \to \mathbb{R}$ for $1 \le n \le 5$.


Remark 7.6


By definition, the Dirichlet kernel $D_n(x)$ is equal to

\[ D_n(x) = \sum_{k=-n}^{n} e^{ikx} = 1 + 2 \sum_{k=1}^{n} \cos kx. \]

From this, one can see that $D_n(x)$ is an infinitely differentiable $2\pi$-periodic function, and it is an even function.


Recall that $g : [a, b] \to \mathbb{C}$ is a step function if there is a partition $P = \{x_0, x_1, \ldots, x_l\}$ of $[a, b]$ such that for each $1 \le j \le l$, $g : (x_{j-1}, x_j) \to \mathbb{C}$ is a constant function. It is easy to see that $g : [a, b] \to \mathbb{C}$ is a step function if and only if both its real and imaginary parts are step functions. The following theorem asserts that a Riemann integrable function $f : [a, b] \to \mathbb{C}$ can be approximated in $L^1$ by step functions.


Theorem 7.9


Let $f : [a, b] \to \mathbb{C}$ be a Riemann integrable function. For every $\varepsilon > 0$, there is a step function $g : [a, b] \to \mathbb{C}$ such that

\[ \int_a^b |f(x) - g(x)|\,dx < \varepsilon. \]




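Let $f(x) = u(x) + iv(x)$, where $u : [a, b] \to \mathbb{R}$ and $v : [a, b] \to \mathbb{R}$ are the real and imaginary parts of $f$. Assume that $u_1 : [a, b] \to \mathbb{R}$ and $v_1 : [a, b] \to \mathbb{R}$ are step functions such that

\[ \int_a^b |u(x) - u_1(x)|\,dx < \frac{\varepsilon}{2}, \qquad \int_a^b |v(x) - v_1(x)|\,dx < \frac{\varepsilon}{2}. \]

Let $g : [a, b] \to \mathbb{C}$ be the function $g = u_1 + iv_1$. Then $g$ is a step function. By triangle inequality, we have

\[ \int_a^b |f(x) - g(x)|\,dx \le \int_a^b |u(x) - u_1(x)|\,dx + \int_a^b |v(x) - v_1(x)|\,dx < \varepsilon. \]

Therefore, it is sufficient to prove the theorem when $f$ is a real-valued function.

Given $\varepsilon > 0$, since $f : [a, b] \to \mathbb{R}$ is Riemann integrable, there is a partition $P = \{x_0, x_1, \ldots, x_l\}$ of $[a, b]$ such that

\[ U(f, P) - L(f, P) < \varepsilon. \]

Here

\[ U(f, P) = \sum_{j=1}^{l} M_j (x_j - x_{j-1}), \qquad M_j = \sup_{x_{j-1} \le x \le x_j} f(x), \]

\[ L(f, P) = \sum_{j=1}^{l} m_j (x_j - x_{j-1}), \qquad m_j = \inf_{x_{j-1} \le x \le x_j} f(x), \]

are respectively the Darboux upper sum and Darboux lower sum of $f$ with respect to the partition $P$. Define the function $g : [a, b] \to \mathbb{R}$ by

\[ g(x) = m_j \quad \text{when } x_{j-1} \le x < x_j, \]

and $g(b) = f(b)$. Then $g$ is a step function, and

\[ g(x) \le f(x) \quad \text{for all } x \in [a, b]. \]

It follows that

\[ \int_a^b |f(x) - g(x)|\,dx = \int_a^b f(x)\,dx - \int_a^b g(x)\,dx = \int_a^b f(x)\,dx - L(f, P) \le U(f, P) - L(f, P) < \varepsilon. \]

This completes the proof.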


Figure 7.8: Approximating a Riemann integrable function by a step function. 


An important tool in the proof of pointwise convergence of Fourier series is the Riemann-Lebesgue lemma. This lemma is important in its own right. Hence, we state it in the most general setting.

Theorem 7.10 The Riemann-Lebesgue Lemma


Let $I = [a, b]$. If $f : I \to \mathbb{C}$ is a Riemann integrable function, then

\[ \lim_{\beta \to \infty} \int_a^b f(x) e^{i\beta x}\,dx = 0. \]




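Given $\varepsilon > 0$, Theorem 7.9 says that there is a step function $g : [a, b] \to \mathbb{C}$ such that

\[ \int_a^b |f(x) - g(x)|\,dx < \frac{\varepsilon}{2}. \]

This implies that

\[ \left| \int_a^b (f(x) - g(x)) e^{i\beta x}\,dx \right| \le \int_a^b |f(x) - g(x)|\,dx < \frac{\varepsilon}{2}. \]

Let $P = \{x_0, x_1, \ldots, x_l\}$ be a partition of $[a, b]$ such that for $1 \le j \le l$, $g(x) = m_j$ for all $x$ in $(x_{j-1}, x_j)$. Let $M = \max\{|m_1|, \ldots, |m_l|\}$. Then for $\beta > 0$,

\[ \left| \int_{x_{j-1}}^{x_j} g(x) e^{i\beta x}\,dx \right| = \left| \frac{m_j}{i\beta} \left( e^{i\beta x_j} - e^{i\beta x_{j-1}} \right) \right| \le \frac{2M}{\beta}. \]

It follows that if $\beta > \dfrac{4Ml}{\varepsilon}$,

\[ \left| \int_a^b g(x) e^{i\beta x}\,dx \right| \le \sum_{j=1}^{l} \left| \int_{x_{j-1}}^{x_j} g(x) e^{i\beta x}\,dx \right| \le \frac{2Ml}{\beta} < \frac{\varepsilon}{2}. \]

Therefore,

\[ \left| \int_a^b f(x) e^{i\beta x}\,dx \right| \le \left| \int_a^b (f(x) - g(x)) e^{i\beta x}\,dx \right| + \left| \int_a^b g(x) e^{i\beta x}\,dx \right| < \varepsilon. \]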

This proves the assertion. 
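The decay asserted by the Riemann-Lebesgue lemma can be observed numerically. The sketch below takes $f(x) = x$ on $[0, 1]$ (an arbitrary choice made for illustration) and approximates $\int_0^1 f(x) e^{i\beta x}\,dx$ by a midpoint rule for increasing $\beta$. For this particular $f$, integration by parts shows the modulus behaves like $1/\beta$.

```python
import cmath

def oscillatory_integral(f, beta, a=0.0, b=1.0, n=100000):
    """Midpoint-rule approximation of the integral of f(x) e^{i beta x} over [a, b]."""
    h = (b - a) / n
    total = 0j
    for j in range(n):
        x = a + (j + 0.5) * h
        total += f(x) * cmath.exp(1j * beta * x)
    return total * h

f = lambda x: x   # illustrative choice of a Riemann integrable function

# Modulus of the integral for increasing frequencies beta.
magnitudes = {beta: abs(oscillatory_integral(f, beta)) for beta in (10.0, 100.0, 1000.0)}
print(magnitudes)
```

The number of quadrature points is chosen large relative to the largest $\beta$ so that the oscillation is resolved; larger values of $\beta$ would need a finer grid.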


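Since

\[ \sin \beta x = \frac{e^{i\beta x} - e^{-i\beta x}}{2i}, \]

we obtain the following.


Corollary 7.11


Let $I = [a, b]$. If $f : I \to \mathbb{C}$ is a Riemann integrable function, then

\[ \lim_{\beta \to \infty} \int_a^b f(x) \sin \beta x\,dx = 0. \]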




Recall that a function $f : [a, b] \to \mathbb{C}$ is piecewise continuous if there is a partition $P = \{x_0, x_1, \ldots, x_l\}$ of $[a, b]$ such that for each $1 \le j \le l$, $f : (x_{j-1}, x_j) \to \mathbb{C}$ is a continuous function. It is piecewise differentiable if there is a partition $P = \{x_0, x_1, \ldots, x_l\}$ of $[a, b]$ such that for each $1 \le j \le l$, $f : (x_{j-1}, x_j) \to \mathbb{C}$ is a differentiable function. Obviously, if $f : [a, b] \to \mathbb{C}$ is piecewise differentiable, it is piecewise continuous. Piecewise continuity and piecewise differentiability do not impose any conditions at the partition points. In the following, we introduce a class of functions which satisfy stronger conditions at the partition points.

Definition 7.9 Strongly Piecewise Differentiable Functions

A function $f : [a, b] \to \mathbb{C}$ is strongly piecewise continuous if there is a partition $P = \{x_0, x_1, \ldots, x_l\}$ of $[a, b]$ such that for each $1 \le j \le l$, the limits

\[ f_+(x_{j-1}) = \lim_{x \to x_{j-1}^+} f(x) \quad \text{and} \quad f_-(x_j) = \lim_{x \to x_j^-} f(x) \]

exist, and the function $g_j : [x_{j-1}, x_j] \to \mathbb{C}$ defined as

\[ g_j(x) = \begin{cases} f_+(x_{j-1}), & \text{if } x = x_{j-1}, \\ f(x), & \text{if } x_{j-1} < x < x_j, \\ f_-(x_j), & \text{if } x = x_j, \end{cases} \]

is continuous. If $f$ is also differentiable on $(x_{j-1}, x_j)$, and the limits

\[ f'_+(x_{j-1}) = \lim_{h \to 0^+} \frac{f(x_{j-1} + h) - f_+(x_{j-1})}{h} \quad \text{and} \quad f'_-(x_j) = \lim_{h \to 0^-} \frac{f(x_j + h) - f_-(x_j)}{h} \]

exist, we say that $f : [a, b] \to \mathbb{C}$ is strongly piecewise differentiable.


Notice that a strongly piecewise differentiable function is strongly piecewise continuous and bounded. Therefore, it is Riemann integrable.
We have abused notation above and denote the limit

\[ \lim_{h \to 0^+} \frac{f(c + h) - f_+(c)}{h} \]

as $f'_+(c)$. Strictly speaking, $f'_+(c)$ is the right derivative of $f$ at $c$, which is defined as

\[ \lim_{h \to 0^+} \frac{f(c + h) - f(c)}{h}. \]

The two expressions are equivalent if $f(c) = f_+(c)$, meaning that $f$ is right continuous at $c$. In fact, for the limit

\[ \lim_{h \to 0^+} \frac{f(c + h) - f(c)}{h} \]

to exist, a necessary condition is that $f_+(c)$ exists and is equal to $f(c)$. However, here we do not require the function $f$ to be continuous at the partition points $x_j$, $0 \le j \le l$. We only require the function to have left and right limits at these points. Since $f(c) = f_+(c) = f_-(c)$ only when $f$ is continuous at $c$, we modify the definitions of $f'_+(c)$ and $f'_-(c)$ for functions that can have a discontinuity at the point $c$.

If $x$ is an interior point of $I$ and $f : I \to \mathbb{C}$ is differentiable at $x$,

\[ \lim_{h \to 0^+} \frac{f(x + h) - f(x)}{h} = f'_+(x) = f'(x) = f'_-(x) = \lim_{h \to 0^-} \frac{f(x + h) - f(x)}{h}. \]

Thus, if the function $f : [a, b] \to \mathbb{C}$ is strongly piecewise differentiable, then for any $x \in (a, b)$,

\[ \lim_{h \to 0^+} \frac{f(x + h) + f(x - h) - f_+(x) - f_-(x)}{h} = f'_+(x) - f'_-(x). \tag{7.2} \]


Example 7.9 


The function $f : [-\pi, \pi] \to \mathbb{R}$ defined as


if —t<2<0, 
if0<2<q7, 


is strongly piecewise differentiable. 




Figure 7.9: The strongly piecewise differentiable function defined in Example 7.9. 


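Lemma 7.12


Let $f : [-\pi, \pi] \to \mathbb{C}$ be a strongly piecewise differentiable function, and let $\tilde{f} : \mathbb{R} \to \mathbb{C}$ be the $2\pi$-periodic extension of $f$. Given $x \in \mathbb{R}$, define the function $h : [0, \pi] \to \mathbb{C}$ by

\[ h(t) = \frac{\tilde{f}(x + t) + \tilde{f}(x - t) - \tilde{f}_+(x) - \tilde{f}_-(x)}{\sin\dfrac{t}{2}}, \quad t \in (0, \pi], \tag{7.3} \]

and $h(0)$ can be any value. Then $h : [0, \pi] \to \mathbb{C}$ is a piecewise continuous bounded function. Hence, $h : [0, \pi] \to \mathbb{C}$ is Riemann integrable.


When $t \in [0, \pi]$, $\sin\dfrac{t}{2} = 0$ only when $t = 0$. Notice that $\tilde{f}$ is a bounded piecewise continuous function. Hence, $h : [0, \pi] \to \mathbb{C}$ is piecewise continuous. For any positive number $r$ that is less than $\pi$, $h$ is bounded on $[r, \pi]$. To show that $h : [0, \pi] \to \mathbb{C}$ is bounded, it is sufficient to show that $h(t)$ has a limit when $t \to 0^+$. Now

\[ \lim_{t \to 0^+} h(t) = \lim_{t \to 0^+} \frac{\tilde{f}(x + t) + \tilde{f}(x - t) - \tilde{f}_+(x) - \tilde{f}_-(x)}{t} \times \lim_{t \to 0^+} \frac{t}{\sin\dfrac{t}{2}}. \]

Since $\displaystyle\lim_{t \to 0^+} \frac{t}{\sin(t/2)} = 2$, eq. (7.2) gives

\[ \lim_{t \to 0^+} h(t) = 2\left( \tilde{f}'_+(x) - \tilde{f}'_-(x) \right). \]

This completes the proof.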


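Now we can prove Dirichlet's theorem.


Theorem 7.13 Dirichlet's Theorem


Let $f : [-\pi, \pi] \to \mathbb{C}$ be a strongly piecewise differentiable function, and let $\tilde{f} : \mathbb{R} \to \mathbb{C}$ be the $2\pi$-periodic extension of $f$. For every $x \in \mathbb{R}$, the Fourier series of $f$ converges at the point $x$ to

\[ \frac{\tilde{f}_-(x) + \tilde{f}_+(x)}{2}. \]


Notice that the function $\tilde{f}$ is also strongly piecewise differentiable on any compact interval. The $n^{\text{th}}$ partial sum of the Fourier series of $f$ is

\[ s_n(x) = \sum_{k=-n}^{n} c_k e^{ikx}, \quad \text{where } c_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-ikx}\,dx. \]

For a fixed real number $x$, we want to show that $s_n(x)$ converges to the number

\[ u = \frac{\tilde{f}_-(x) + \tilde{f}_+(x)}{2}. \]

By Proposition 7.8,

\[ s_n(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) D_n(x - t)\,dt. \]

Since $\tilde{f}$ and $D_n$ are $2\pi$-periodic functions, we find that

\[ s_n(x) = \frac{1}{2\pi} \int_{x - \pi}^{x + \pi} \tilde{f}(x - t) D_n(t)\,dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} \tilde{f}(x - t) D_n(t)\,dt. \]

Notice that

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} D_n(t)\,dt = \frac{1}{2\pi} \sum_{k=-n}^{n} \int_{-\pi}^{\pi} e^{ikt}\,dt = 1. \]

Therefore,

\[ s_n(x) - u = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left( \tilde{f}(x - t) - u \right) D_n(t)\,dt. \]

Using the fact that $D_n : \mathbb{R} \to \mathbb{R}$ is an even function, we find that

\[ s_n(x) - u = \frac{1}{2\pi} \int_0^{\pi} \left( \tilde{f}(x + t) + \tilde{f}(x - t) - 2u \right) D_n(t)\,dt \]

\[ = \frac{1}{2\pi} \int_0^{\pi} \frac{\tilde{f}(x + t) + \tilde{f}(x - t) - 2u}{\sin\dfrac{t}{2}} \sin\left[ \left( n + \frac{1}{2} \right) t \right] dt = \frac{1}{2\pi} \int_0^{\pi} h(t) \sin\left[ \left( n + \frac{1}{2} \right) t \right] dt, \]

where $h : [0, \pi] \to \mathbb{C}$ is the function defined by (7.3). By Lemma 7.12, $h : [0, \pi] \to \mathbb{C}$ is Riemann integrable. By the Riemann-Lebesgue lemma,

\[ \lim_{n \to \infty} \int_0^{\pi} h(t) \sin\left[ \left( n + \frac{1}{2} \right) t \right] dt = 0. \]

This proves that

\[ \lim_{n \to \infty} s_n(x) = u. \]


From Dirichlet's theorem, we can spell out the following explicitly.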




Corollary 7.14 


If $f : [-\pi, \pi] \to \mathbb{C}$ is a strongly piecewise differentiable function, then its Fourier series converges pointwise. Denote by $F : \mathbb{R} \to \mathbb{C}$ the Fourier series of $f$. Then for any $x \in \mathbb{R}$,

\[ F(x) = \frac{\tilde{f}_-(x) + \tilde{f}_+(x)}{2}, \]

where $\tilde{f} : \mathbb{R} \to \mathbb{C}$ is the $2\pi$-periodic extension of $f$. This implies that

(a) If $x \in (-\pi, \pi)$ and $f$ is continuous at $x$, then $F(x) = f(x)$.

(b) If $f$ is right continuous at $-\pi$, left continuous at $\pi$, and $f(-\pi) = f(\pi)$, then $F(-\pi) = F(\pi) = f(-\pi) = f(\pi)$.


Let us look at a few examples. 


Example 7.10 


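Let $f : [-\pi, \pi] \to \mathbb{R}$ be the function $f(x) = x$ considered in Example 7.4. Since $f$ is a strongly piecewise differentiable function, the Fourier series of $f$ converges pointwise. We have shown that the Fourier series is given by

\[ F(x) = 2 \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k} \sin kx. \]

For any $x \in (-\pi, \pi)$, this series converges to $x$. When $x = \pi$, it converges to

\[ \frac{\tilde{f}_-(\pi) + \tilde{f}_+(\pi)}{2} = \frac{\pi + (-\pi)}{2} = 0. \]

When $x = \dfrac{\pi}{2}$, since $\sin\dfrac{k\pi}{2}$ is 0 when $k = 2n$, it is equal to 1 when $k = 4n + 1$, and it is equal to $-1$ when $k = 4n + 3$, we deduce that

\[ 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots = \frac{1}{2} F\left( \frac{\pi}{2} \right) = \frac{1}{2} f\left( \frac{\pi}{2} \right) = \frac{\pi}{4}, \]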
which is just the Newton-Gregory formula. 
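The three limits discussed in this example can be observed numerically from the partial sums $s_n(x) = 2 \sum_{k=1}^{n} \frac{(-1)^{k-1}}{k} \sin kx$ of the Fourier series of $f(x) = x$. The sketch below is only illustrative; the cut-off $n = 2000$ and the sample points are arbitrary choices.

```python
import math

def partial_sum(n, x):
    """n-th partial sum 2 * sum_{k=1}^{n} (-1)^(k-1) sin(kx) / k of the Fourier series of f(x) = x."""
    return 2.0 * sum((-1) ** (k - 1) * math.sin(k * x) / k for k in range(1, n + 1))

n = 2000
at_one = partial_sum(n, 1.0)                 # interior point: the limit is f(1) = 1
at_half_pi = partial_sum(n, math.pi / 2.0)   # the limit is pi / 2, i.e. 2 * (1 - 1/3 + 1/5 - ...)
at_pi = partial_sum(n, math.pi)              # jump of the periodic extension: the limit is 0
print(at_one, at_half_pi, at_pi)
```

The convergence at interior points is slow, of order $1/n$, which is consistent with the jump of the periodic extension at odd multiples of $\pi$.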




Figure 7.10: Convergence of the Fourier series of the function $f(x) = x$, $-\pi < x < \pi$.


Example 7.11 


The function $f : [0, 2\pi] \to \mathbb{R}$, $f(x) = x(2\pi - x)$, considered in Example 7.6 is a strongly piecewise differentiable function. Hence, its Fourier series converges everywhere. Since $f : [0, 2\pi] \to \mathbb{R}$ is continuous and $f(0) = f(2\pi)$, we find that $F(x) = f(x)$ for all $x \in [0, 2\pi]$. In particular, setting $x = 0$ and $x = \pi$ respectively, we find that

\[ \sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6}, \qquad \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k^2} = \frac{\pi^2}{12}. \]

Figure 7.11: The function $f : [0, 2\pi] \to \mathbb{R}$, $f(x) = x(2\pi - x)$ and its extension.


Example 7.12 


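In Example 7.7, we consider the function $g : [-\pi, \pi] \to \mathbb{R}$ defined as

\[ g(x) = \begin{cases} 1, & \text{if } a \le x \le b, \\ 0, & \text{otherwise}, \end{cases} \]

where $a$ and $b$ are two numbers satisfying $-\pi < a < b < \pi$. Notice that $g$ is a strongly piecewise differentiable function. Thus, its Fourier series

\[ G(x) = \frac{b - a}{2\pi} + \frac{i}{2\pi} \sum_{k=1}^{\infty} \frac{\left( e^{-ikb} - e^{-ika} \right) e^{ikx} - \left( e^{ikb} - e^{ika} \right) e^{-ikx}}{k} \]

converges pointwise. If $x \in (a, b)$, Dirichlet's theorem says that $G(x) = g(x) = 1$. Hence, for any $x \in (a, b) \subset (-\pi, \pi)$,

\[ \frac{b - a}{2\pi} + \frac{i}{2\pi} \sum_{k=1}^{\infty} \frac{\left( e^{-ikb} - e^{-ika} \right) e^{ikx} - \left( e^{ikb} - e^{ika} \right) e^{-ikx}}{k} = 1. \]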



Remark 7.7 


If we scrutinize the proof of Theorem 7.13, we find that a necessary and sufficient condition for the Fourier series of a $2\pi$-periodic function $f : \mathbb{R} \to \mathbb{C}$ to converge at a point $x$ is that the limit

\[ \lim_{n \to \infty} \int_0^{\pi} \frac{f(x + t) + f(x - t)}{t} \sin\left[ \left( n + \frac{1}{2} \right) t \right] dt \]

exists. This is known as Riemann's localization theorem.

Theorem 7.13 says that if $f : [-\pi, \pi] \to \mathbb{R}$ is strongly piecewise differentiable, then this limit exists. This is sufficient for most of our applications.


Remark 7.8 Fourier Sine Series and Fourier Cosine Series 


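Let $L > 0$, and let $f : [0, L] \to \mathbb{C}$ be a Riemann integrable function defined on $[0, L]$. We can extend $f$ to be an odd function $f_o : [-L, L] \to \mathbb{C}$ by defining

\[ f_o(x) = \begin{cases} f(x), & \text{if } 0 < x \le L, \\ 0, & \text{if } x = 0, \\ -f(-x), & \text{if } -L \le x < 0. \end{cases} \]

We can also extend $f$ to be an even function $f_e : [-L, L] \to \mathbb{C}$ by defining

\[ f_e(x) = \begin{cases} f(x), & \text{if } 0 \le x \le L, \\ f(-x), & \text{if } -L \le x < 0. \end{cases} \]

The Fourier series of $f_o : [-L, L] \to \mathbb{C}$ is called the Fourier sine series of $f : [0, L] \to \mathbb{C}$. The Fourier series of $f_e : [-L, L] \to \mathbb{C}$ is called the Fourier cosine series of $f : [0, L] \to \mathbb{C}$.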




Exercises 7.2 
Question 1 
Consider the function $f : [-\pi, \pi] \to \mathbb{R}$, $f(x) = x^3 - \pi^2 x$.
(a) Find the Fourier series of f. 


(b) Use the Fourier series to find the sum 


Question 2 


Let $f : [-\pi, \pi] \to \mathbb{R}$ be the function defined as
r+, if —7o ze =< 0, 
LT, it OS < we. 
(a) Find the Fourier series of f. 
(b) Study the pointwise convergence of the Fourier series. 


Question 3 


Study the pointwise convergence of the Fourier series of the function $f : [-\pi, \pi] \to \mathbb{R}$, $f(x) = |x|$, obtained in Exercises 7.1.


Question 4


Study the pointwise convergence of the Fourier series of the function $f : [-\pi, \pi] \to \mathbb{R}$, $f(x) = x^2$, obtained in Exercises 7.1.


Question 5


Study the pointwise convergence of the Fourier series of the function $f : [0, 2\pi] \to \mathbb{R}$, $f(x) = x^2$, obtained in Exercises 7.1.




7.3 The L? Convergence of a Fourier Series 


In this section, we consider the $L^2$-convergence of a Fourier series. We first define $L^2$-convergence for a sequence of Riemann integrable functions.


Definition 7.10 $L^2$-Convergence


Let $I = [a, b]$ be an interval in $\mathbb{R}$, and let $\{f_n : I \to \mathbb{C}\}$ be a sequence of Riemann integrable functions. We say that $\{f_n : I \to \mathbb{C}\}$ converges in $L^2$ to a function $g : I \to \mathbb{C}$ in $\mathcal{R}(I, \mathbb{C})$ if

\[ \lim_{n \to \infty} \|f_n - g\| = 0. \]


In the vector space $\mathcal{R}(I, \mathbb{C})$, there are nonzero functions $h : I \to \mathbb{C}$ which have zero norm. Hence, if $\{f_n : I \to \mathbb{C}\}$ converges in $L^2$ to a function $g : I \to \mathbb{C}$, the function $g$ is not unique. Nevertheless, we have the following.


Theorem 7.15 


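Let $I = [a, b]$ be an interval in $\mathbb{R}$, and let $\{f_n : I \to \mathbb{C}\}$ be a sequence of functions in $\mathcal{R}(I, \mathbb{C})$. If the sequence converges in $L^2$ to the two functions $g_1 : I \to \mathbb{C}$ and $g_2 : I \to \mathbb{C}$ in $\mathcal{R}(I, \mathbb{C})$, then $g_1$ and $g_2$ are $L^2$-equivalent.


By triangle inequality,

\[ \|g_1 - g_2\| \le \|g_1 - f_n\| + \|f_n - g_2\|. \]

Since

\[ \lim_{n \to \infty} \|f_n - g_1\| = 0 \quad \text{and} \quad \lim_{n \to \infty} \|f_n - g_2\| = 0, \]

we find that $\|g_1 - g_2\| = 0$. Thus, $g_1$ and $g_2$ are $L^2$-equivalent.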




Example 7.13 


Consider the sequence of functions $\{f_n : [0, 1] \to \mathbb{R}\}$ defined as

\[ f_n(x) = \begin{cases} 1, & \text{if } x = \dfrac{m}{n} \text{ for some integer } m, \\[4pt] 0, & \text{otherwise}. \end{cases} \]

Since $f_n : [0, 1] \to \mathbb{R}$ is a function that is nonzero only at finitely many points, we find that $\|f_n\| = 0$. This implies that $\{f_n : I \to \mathbb{R}\}$ converges in $L^2$ to the function $f_0 : I \to \mathbb{R}$ that is identically zero. However, $\{f_n : I \to \mathbb{R}\}$ does not converge pointwise. Take for example the point $x_0 = 1/2$. Then $f_n(x_0) = 1$ if $n$ is even and $f_n(x_0) = 0$ if $n$ is odd. Hence, the sequence $\{f_n(x_0)\}$ does not converge. In other words, for sequences of functions, pointwise convergence and $L^2$-convergence are different.


The Fourier series of a Riemann integrable function $f : I \to \mathbb{C}$ converges in $L^2$ to the function $f : I \to \mathbb{C}$ if

\[ \lim_{n \to \infty} \|s_n - f\| = 0. \]

Here $s_n(x)$ is the $n^{\text{th}}$ partial sum of the Fourier series. The main theorem we want to prove in this section is that the Fourier series of any Riemann integrable function $f : I \to \mathbb{C}$ converges in $L^2$ to $f$ itself. We start with the following theorem, which asserts that a Riemann integrable function $f : [a, b] \to \mathbb{C}$ can be approximated in $L^2$ by step functions.


Theorem 7.16 


Let $f : [a, b] \to \mathbb{C}$ be a Riemann integrable function. For every $\varepsilon > 0$, there is a step function $g : [a, b] \to \mathbb{C}$ such that

\[ \|f - g\| < \varepsilon. \]




As in the proof of Theorem 7.9, it is sufficient to consider the case where the function $f$ is real-valued. Since $f : [a, b] \to \mathbb{R}$ is Riemann integrable, it is bounded. Therefore, there exists $M > 0$ such that

\[ |f(x)| \le M \quad \text{for all } x \in [a, b]. \]

Given $\varepsilon > 0$, Theorem 7.9 says that there is a step function $g : [a, b] \to \mathbb{R}$ such that

\[ \int_a^b |f(x) - g(x)|\,dx < \frac{\varepsilon^2}{2M}. \]

By the construction of $g$ given in the proof of Theorem 7.9, we find that $|g(x)| \le M$ for all $x \in [a, b]$. Therefore,

\[ (f(x) - g(x))^2 \le |f(x) - g(x)| \left( |f(x)| + |g(x)| \right) \le 2M |f(x) - g(x)|. \]

This implies that

\[ \|f - g\|^2 = \int_a^b (f(x) - g(x))^2\,dx \le 2M \int_a^b |f(x) - g(x)|\,dx < \varepsilon^2. \]

Hence, $\|f - g\| < \varepsilon$.


Theorem 7.17 
Let $I = [-\pi, \pi]$. Given a Riemann integrable function $f : I \to \mathbb{C}$, let $\sum_{k=-\infty}^{\infty} c_k e^{ikx}$ be its Fourier series, and let $s_n(x) = \sum_{k=-n}^{n} c_k e^{ikx}$ be the $n^{\text{th}}$ partial sum. We have the following.

(a) For each $n \ge 0$, $\|s_n\|^2 = 2\pi \sum_{k=-n}^{n} |c_k|^2$.

(b) For each $n \ge 0$, we have Bessel's inequality $\|s_n\| \le \|f\|$.

(c) The Fourier series converges in $L^2$ to $f$ if and only if

\[ \lim_{n \to \infty} \|s_n\|^2 = \|f\|^2. \]



For $n \in \mathbb{Z}$, let $\phi_n : \mathbb{R} \to \mathbb{C}$ be the function $\phi_n(x) = e^{inx}$. Then $S = \{\phi_n \mid n \in \mathbb{Z}\}$ is an orthogonal system of functions in $\mathcal{R}(I, \mathbb{C})$, and $\|\phi_n\| = \sqrt{2\pi}$ for all $n \in \mathbb{Z}$. For $n \ge 0$, the set $S_n = \{\phi_k \mid -n \le k \le n\}$ spans the subspace $W_n$, and

\[ s_n = \operatorname{proj}_{W_n} f = \sum_{k=-n}^{n} c_k \phi_k. \]

Since $S_n$ is an orthogonal system, the generalized Pythagoras theorem says that

\[ \|s_n\|^2 = \sum_{k=-n}^{n} |c_k|^2 \|\phi_k\|^2 = 2\pi \sum_{k=-n}^{n} |c_k|^2. \]

This proves part (a).
For part (b), recall that $f - s_n$ is orthogonal to $s_n$. By the generalized Pythagoras theorem again,

\[ \|f\|^2 = \|s_n + (f - s_n)\|^2 = \|s_n\|^2 + \|f - s_n\|^2 \ge \|s_n\|^2. \]

Hence, we find that $\|s_n\| \le \|f\|$. This proves part (b).
Part (c) follows from

\[ \|f\|^2 = \|s_n\|^2 + \|f - s_n\|^2. \]

The Fourier series converges in $L^2$ to $f$ if and only if $\displaystyle\lim_{n \to \infty} \|s_n - f\| = 0$, if and only if

\[ \lim_{n \to \infty} \|s_n\|^2 = \|f\|^2. \]
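Parts (a) and (b) can be illustrated numerically. The sketch below takes $f(x) = x$ on $[-\pi, \pi]$ (our choice, for which $\|f\|^2 = 2\pi^3/3$), approximates the Fourier coefficients $c_k$ by a midpoint rule, and evaluates $\|s_n\|^2 = 2\pi \sum_{k=-n}^{n} |c_k|^2$ for several $n$. The helper names and the quadrature are assumptions of the sketch.

```python
import cmath
import math

def fourier_coefficient(f, k, n=4000):
    """Midpoint-rule approximation of c_k = (1/(2*pi)) * integral over [-pi, pi] of f(x) e^{-ikx}."""
    h = 2.0 * math.pi / n
    total = 0j
    for j in range(n):
        x = -math.pi + (j + 0.5) * h
        total += f(x) * cmath.exp(-1j * k * x)
    return total * h / (2.0 * math.pi)

f = lambda x: x
norm_f_sq = 2.0 * math.pi ** 3 / 3.0   # exact value of the integral of x^2 over [-pi, pi]

# |c_k|^2 for k = -N, ..., N, computed once and reused below.
N = 40
abs_sq = {k: abs(fourier_coefficient(f, k)) ** 2 for k in range(-N, N + 1)}

def norm_sn_sq(n):
    """||s_n||^2 = 2*pi * sum_{k=-n}^{n} |c_k|^2, as in part (a)."""
    return 2.0 * math.pi * sum(abs_sq[k] for k in range(-n, n + 1))

values = [norm_sn_sq(n) for n in (1, 5, 10, 20, 40)]
print(values, norm_f_sq)
```

The values increase with $n$ and stay below $\|f\|^2$, as Bessel's inequality predicts; the remaining gap is $\|f - s_n\|^2$.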


Remark 7.9 


Part (b) of Theorem 7.17 says that for the trigonometric series $\sum_{k=-\infty}^{\infty} c_k e^{ikx}$ to be the Fourier series of a Riemann integrable function, it is necessary that the series $\sum_{k=-\infty}^{\infty} |c_k|^2$ is convergent.




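Now we will prove that the Fourier series of a special type of step functions converges in $L^2$ to the function itself.


Theorem 7.18


Let $a$ and $b$ be two numbers satisfying $-\pi < a < b < \pi$, and let $g : [-\pi, \pi] \to \mathbb{R}$ be the function defined as

\[ g(x) = \begin{cases} 1, & \text{if } a \le x \le b, \\ 0, & \text{otherwise}. \end{cases} \]

The Fourier series of $g$ converges in $L^2$ to the function $g$.


By Theorem 7.17, it is sufficient to show that

\[ \lim_{n \to \infty} \|s_n\|^2 = \|g\|^2. \]

Notice that

\[ \|g\|^2 = \int_{-\pi}^{\pi} |g(x)|^2\,dx = \int_a^b dx = b - a. \]

In Example 7.7, we have seen that the Fourier coefficients of $g$ are

\[ c_0 = \frac{b - a}{2\pi} \quad \text{and} \quad c_k = \frac{e^{-ikb} - e^{-ika}}{-2\pi i k} \quad \text{when } k \ne 0. \]

By part (a) of Theorem 7.17,

\[ \|s_n\|^2 = 2\pi \left( |c_0|^2 + \sum_{k=1}^{n} \left( |c_k|^2 + |c_{-k}|^2 \right) \right) \]

\[ = \frac{(b - a)^2}{2\pi} + 4\pi \sum_{k=1}^{n} \frac{\left( e^{-ikb} - e^{-ika} \right) \left( e^{ikb} - e^{ika} \right)}{4\pi^2 k^2} \]

\[ = \frac{(b - a)^2}{2\pi} + \frac{2}{\pi} \sum_{k=1}^{n} \frac{1 - \cos k(b - a)}{k^2}. \]

By Example 7.11,

\[ x(2\pi - x) = \frac{2\pi^2}{3} - 4 \sum_{k=1}^{\infty} \frac{\cos kx}{k^2} \quad \text{for all } x \in [0, 2\pi], \qquad \sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6}. \]

Therefore,

\[ \lim_{n \to \infty} \|s_n\|^2 = \frac{(b - a)^2}{2\pi} + \frac{2}{\pi} \sum_{k=1}^{\infty} \frac{1}{k^2} - \frac{2}{\pi} \sum_{k=1}^{\infty} \frac{\cos k(b - a)}{k^2} \]

\[ = \frac{(b - a)^2}{2\pi} + \frac{\pi}{3} + \frac{1}{2\pi} \left( (b - a)(2\pi - (b - a)) - \frac{2\pi^2}{3} \right) \]

\[ = b - a = \|g\|^2. \]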


This proves that the Fourier series of g converges in L? to g. 
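The limit computed in this proof can be checked numerically from the expression $\|s_n\|^2 = \frac{(b-a)^2}{2\pi} + \frac{2}{\pi} \sum_{k=1}^{n} \frac{1 - \cos k(b-a)}{k^2}$ obtained from part (a) of Theorem 7.17. In the sketch below, the endpoints $a = -1$ and $b = 2$ are an arbitrary choice satisfying $-\pi < a < b < \pi$, and the function name is ours.

```python
import math

def norm_sn_sq(a, b, n):
    """(b-a)^2 / (2*pi) + (2/pi) * sum_{k=1}^{n} (1 - cos k(b-a)) / k^2, the value of ||s_n||^2."""
    d = b - a
    partial = sum((1.0 - math.cos(k * d)) / (k * k) for k in range(1, n + 1))
    return d * d / (2.0 * math.pi) + 2.0 * partial / math.pi

a, b = -1.0, 2.0   # any choice with -pi < a < b < pi
values = [norm_sn_sq(a, b, n) for n in (10, 1000, 100000)]
print(values, b - a)
```

The values increase towards $b - a = \|g\|^2$ and never exceed it, in agreement with Bessel's inequality.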


Now we can prove our main theorem. 


Theorem 7.19 L? Convergence of Fourier Series 


Let $I = [-\pi, \pi]$ and let $f : I \to \mathbb{C}$ be a Riemann integrable function. Then the Fourier series of $f$ converges in $L^2$ to $f$ itself.


Since we will be dealing with more than one function here, we use $s_n(f)$ to denote the $n^{\text{th}}$ partial sum of the Fourier series of $f$.
We will show that given $\varepsilon > 0$, there exists a positive integer $N$ such that for all $n \ge N$,

\[ \|s_n(f) - f\| < \varepsilon. \]

Fix $\varepsilon > 0$. Theorem 7.16 says that there is a step function $g : [-\pi, \pi] \to \mathbb{C}$ such that

\[ \|f - g\| < \frac{\varepsilon}{3}. \]

Let $P = \{x_0, x_1, \ldots, x_l\}$ be the partition of $[-\pi, \pi]$ such that for each $1 \le j \le l$, $g$ is constant on $(x_{j-1}, x_j)$. Define $g_j : [-\pi, \pi] \to \mathbb{C}$ by

\[ g_j(x) = \begin{cases} g(x), & \text{if } x_{j-1} < x < x_j, \\ 0, & \text{otherwise}. \end{cases} \]

Then

\[ g(x) = g_1(x) + \cdots + g_l(x) \quad \text{for all } x \in [-\pi, \pi] \setminus P. \]

Since Riemann integrals are not affected by function values at finitely many points, it follows that for each $n \ge 0$,

\[ s_n(g) = \sum_{j=1}^{l} s_n(g_j). \]

By Theorem 7.17,

\[ \|s_n(f) - s_n(g)\| = \|s_n(f - g)\| \le \|f - g\| < \frac{\varepsilon}{3}. \]

By triangle inequality,

\[ \|g - s_n(g)\| = \left\| \sum_{j=1}^{l} \left( g_j - s_n(g_j) \right) \right\| \le \sum_{j=1}^{l} \|g_j - s_n(g_j)\|. \]

By Theorem 7.18, $\displaystyle\lim_{n \to \infty} \|g_j - s_n(g_j)\| = 0$ for $1 \le j \le l$. Therefore,

\[ \lim_{n \to \infty} \|g - s_n(g)\| = 0. \]

This implies that there is a positive integer $N$ such that

\[ \|g - s_n(g)\| < \frac{\varepsilon}{3} \quad \text{for all } n \ge N. \]

It follows that for all $n \ge N$,

\[ \|f - s_n(f)\| \le \|f - g\| + \|g - s_n(g)\| + \|s_n(f) - s_n(g)\| < \varepsilon. \]


This completes the proof. 




A consequence of Theorem 7.19 is the following. 


Theorem 7.20 Parseval’s Identity I 


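Let $f : [-\pi, \pi] \to \mathbb{C}$ be a Riemann integrable function, and let

$$c_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-ikx}\,dx, \qquad k \in \mathbb{Z},$$

be its Fourier coefficients. Then

$$\int_{-\pi}^{\pi} |f(x)|^2\,dx = \|f\|^2 = 2\pi \sum_{k=-\infty}^{\infty} |c_k|^2.$$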


In Theorem 7.19, we have shown that the Fourier series of $f$ converges in
$L^2$ to $f$. By Theorem 7.17, this means that

$$\lim_{n \to \infty} \|s_n\|^2 = \|f\|^2.$$

Since

$$\|s_n\|^2 = 2\pi \sum_{k=-n}^{n} |c_k|^2,$$

Parseval's identity follows.


Corollary 7.21 Parseval’s Identity II 


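Let $f : [-\pi, \pi] \to \mathbb{R}$ be a real-valued Riemann integrable function, and let

$$a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos kx\,dx, \quad k \geq 0,$$

$$b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin kx\,dx, \quad k \geq 1,$$

be its Fourier coefficients. Then

$$\int_{-\pi}^{\pi} f(x)^2\,dx = \|f\|^2 = \pi \left\{ \frac{a_0^2}{2} + \sum_{k=1}^{\infty} \left( a_k^2 + b_k^2 \right) \right\}.$$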




This can be proved using $c_0 = \dfrac{a_0}{2} \in \mathbb{R}$, and when $k \geq 1$,

$$c_k = \frac{a_k - i b_k}{2}, \qquad c_{-k} = \frac{a_k + i b_k}{2},$$

and $a_k$ and $b_k$ are real numbers.


Let us look at a few examples. 


Example 7.14 


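In Example 7.4, we have seen that the Fourier series of the function $f : [-\pi, \pi] \to \mathbb{R}$, $f(x) = x$ is

$$\sum_{k=1}^{\infty} \frac{2(-1)^{k-1}}{k} \sin kx.$$

By Parseval's identity,

$$4\pi \sum_{k=1}^{\infty} \frac{1}{k^2} = \int_{-\pi}^{\pi} x^2\,dx = \frac{2\pi^3}{3}.$$

This gives

$$\sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6},$$

an identity we have obtained before.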
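
As a quick numerical sanity check of Parseval's identity for $f(x) = x$, the following Python sketch compares the partial sums $\pi \sum_{k=1}^{n} b_k^2$, where $b_k = 2(-1)^{k-1}/k$ are the sine coefficients of $f$, with $\|f\|^2 = 2\pi^3/3$. The name `parseval_partial_sum` is an illustrative choice and is not part of the text.

```python
import math

def parseval_partial_sum(n):
    """Return pi * sum_{k=1}^{n} b_k^2 with b_k = 2(-1)^(k-1)/k (so b_k^2 = 4/k^2)."""
    return math.pi * sum((2.0 / k) ** 2 for k in range(1, n + 1))

# ||f||^2 = integral of x^2 over [-pi, pi]
norm_squared = 2.0 * math.pi ** 3 / 3.0

for n in (10, 100, 1000, 10000):
    print(n, parseval_partial_sum(n), norm_squared)
```

The partial sums increase towards $\|f\|^2$, and the gap behaves roughly like $4\pi/n$, which reflects the slow $1/k^2$ decay of $b_k^2$.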


Example 7.15 


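In Example 7.5, we have seen that the Fourier series of the function $f : [0, 2\pi] \to \mathbb{R}$, $f(x) = x(2\pi - x)$ is

$$\frac{2\pi^2}{3} - 4 \sum_{k=1}^{\infty} \frac{\cos kx}{k^2}.$$

Using Parseval's identity, we have

$$\pi \left( \frac{8\pi^4}{9} + 16 \sum_{k=1}^{\infty} \frac{1}{k^4} \right) = \int_0^{2\pi} x^2 (2\pi - x)^2\,dx = \int_0^{2\pi} \left( 4\pi^2 x^2 - 4\pi x^3 + x^4 \right) dx$$

$$= \left[ \frac{4\pi^2 x^3}{3} - \pi x^4 + \frac{x^5}{5} \right]_0^{2\pi} = 32\pi^5 \left( \frac{1}{3} - \frac{1}{2} + \frac{1}{5} \right) = \frac{16\pi^5}{15}.$$

Therefore,

$$\sum_{k=1}^{\infty} \frac{1}{k^4} = \frac{\pi^4}{90}.$$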


From the Parseval’s identity, we can also obtain the following. 


Theorem 7.22 


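Let $I = [-\pi, \pi]$. Given that $f : I \to \mathbb{C}$ and $g : I \to \mathbb{C}$ are Riemann
integrable functions, let

$$c_k(f) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-ikx}\,dx, \qquad c_k(g) = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(x) e^{-ikx}\,dx$$

be their Fourier coefficients. Then

$$\int_{-\pi}^{\pi} f(x) \overline{g(x)}\,dx = \langle f, g \rangle = 2\pi \sum_{k=-\infty}^{\infty} c_k(f) \overline{c_k(g)}.$$

This follows from Theorem 7.20 and the polarization formula

$$\langle f, g \rangle = \frac{1}{4} \left( \|f + g\|^2 - \|f - g\|^2 + i \|f + ig\|^2 - i \|f - ig\|^2 \right).$$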


We can use Theorem 7.22 to prove that Fourier series can be integrated term by term.
Theorem 7.23 Term-by-Term Integration of Fourier Series 


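Let $I = [-\pi, \pi]$. Given that $f : I \to \mathbb{C}$ is a Riemann integrable function,
let $\displaystyle\sum_{k=-\infty}^{\infty} c_k e^{ikx}$ be its Fourier series. On any compact interval $J = [a, b]$
that is contained in $I$, we can integrate term by term and obtain

$$\int_a^b f(x)\,dx = \sum_{k=-\infty}^{\infty} c_k \int_a^b e^{ikx}\,dx.$$

Let $g : [-\pi, \pi] \to \mathbb{R}$ be the function defined as

$$g(x) = \begin{cases} 1, & \text{if } a \leq x \leq b, \\ 0, & \text{otherwise.} \end{cases}$$

Using Theorem 7.22, we find that

$$\int_a^b f(x)\,dx = \int_{-\pi}^{\pi} f(x) \overline{g(x)}\,dx = 2\pi \sum_{k=-\infty}^{\infty} c_k(f) \overline{c_k(g)} = \sum_{k=-\infty}^{\infty} c_k(f) \int_a^b e^{ikx}\,dx.$$

This proves the assertion.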


Remark 7.10 


Theorem 7.23 is remarkable since we do not require the Fourier series of f 


to converge uniformly. 




Example 7.16 


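In Example 7.4, we have seen that the Fourier series of the function $f : [-\pi, \pi] \to \mathbb{R}$, $f(x) = x$ is

$$\sum_{k=1}^{\infty} \frac{2(-1)^{k-1}}{k} \sin kx.$$

For $x \in [-\pi, \pi]$, term-by-term integration gives

$$\int_0^x t\,dt = \sum_{k=1}^{\infty} \frac{2(-1)^{k-1}}{k} \int_0^x \sin kt\,dt.$$

This implies that

$$\frac{x^2}{2} = \sum_{k=1}^{\infty} \frac{2(-1)^{k-1}}{k^2} \left( 1 - \cos kx \right).$$

Since $\displaystyle\sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k^2} = \frac{\pi^2}{12}$, we find that

$$x^2 = \frac{\pi^2}{3} + 4 \sum_{k=1}^{\infty} \frac{(-1)^k}{k^2} \cos kx \quad \text{for all } x \in [-\pi, \pi].$$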
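
The term-by-term integration in this example can be checked numerically. The sketch below assumes the sine series $\sum_{k \geq 1} 2(-1)^{k-1} \sin kx / k$ of $f(x) = x$ and sums the integrated terms $2(-1)^{k-1}(1 - \cos kx)/k^2$, which should approach $x^2/2$ on $[-\pi, \pi]$. The helper name `integrated_series` is hypothetical.

```python
import math

def integrated_series(x, n):
    """Sum over k = 1..n of (2(-1)^(k-1)/k) * integral_0^x sin(kt) dt."""
    total = 0.0
    for k in range(1, n + 1):
        sign = 1.0 if k % 2 == 1 else -1.0
        # integral of sin(kt) from 0 to x equals (1 - cos(kx)) / k
        total += 2.0 * sign * (1.0 - math.cos(k * x)) / (k * k)
    return total

for x in (0.5, 1.0, 2.0, 3.0):
    print(x, integrated_series(x, 20000), x * x / 2.0)
```

Unlike the original sine series, the integrated series is dominated by $\sum 4/k^2$, so it converges absolutely and uniformly.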


Theorem 7.24 General Term-by-Term Integration of Fourier Series 


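Let $I = [-\pi, \pi]$. Given that $f : I \to \mathbb{C}$ is a Riemann integrable function,
let $\displaystyle\sum_{k=-\infty}^{\infty} c_k e^{ikx}$ be its Fourier series. Let $g : I \to \mathbb{C}$ be any other Riemann
integrable function. On any compact interval $J = [a, b]$ that is contained in
$I$, we have

$$\int_a^b f(x) g(x)\,dx = \sum_{k=-\infty}^{\infty} c_k \int_a^b g(x) e^{ikx}\,dx.$$

Let $h : I \to \mathbb{C}$ be the function defined as

$$h(x) = \begin{cases} \overline{g(x)}, & \text{if } a \leq x \leq b, \\ 0, & \text{otherwise.} \end{cases}$$

Then $h$ is Riemann integrable, and

$$c_k(h) = \frac{1}{2\pi} \int_a^b \overline{g(x)} e^{-ikx}\,dx.$$

Hence,

$$\overline{c_k(h)} = \frac{1}{2\pi} \int_a^b g(x) e^{ikx}\,dx.$$

Using Theorem 7.22, we find that

$$\int_a^b f(x) g(x)\,dx = \int_{-\pi}^{\pi} f(x) \overline{h(x)}\,dx = 2\pi \sum_{k=-\infty}^{\infty} c_k(f) \overline{c_k(h)} = \sum_{k=-\infty}^{\infty} c_k(f) \int_a^b g(x) e^{ikx}\,dx.$$

This proves the assertion.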




Exercises 7.3 


Question 1 


Consider the function $f : [-\pi, \pi] \to \mathbb{R}$, $f(x) = x^3 - \pi^2 x$ whose Fourier
series has been obtained in Exercises 7.2. Use Parseval's identity to find the
sum

$$\frac{1}{1^6} + \frac{1}{2^6} + \frac{1}{3^6} + \frac{1}{4^6} + \cdots.$$


Question 2 


Use term-by-term integration to find the Fourier series of the function $f : [-\pi, \pi] \to \mathbb{R}$, $f(x) = x^3$.




7.4 The Uniform Convergence of a Trigonometric Series 


In this section, we consider uniform convergence of a trigonometric series. In 
volume I, we have studied uniform convergence. If a series of continuous functions 
converges uniformly, then it represents a continuous function, and the series can 
be integrated term-by-term. 

Applying this to trigonometric series, we have the following.


Theorem 7.25 
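If the trigonometric series $\displaystyle\sum_{k=-\infty}^{\infty} c_k e^{ikx}$ converges uniformly, it defines a
continuous $2\pi$-periodic function $F : \mathbb{R} \to \mathbb{C}$. The Fourier series of $F : [-\pi, \pi] \to \mathbb{C}$ is the series $\displaystyle\sum_{k=-\infty}^{\infty} c_k e^{ikx}$ itself.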


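For $k \in \mathbb{Z}$, the function $\phi_k(x) = e^{ikx}$ is a $2\pi$-periodic continuous function.
The assertion that the trigonometric series $\displaystyle\sum_{k=-\infty}^{\infty} c_k e^{ikx}$ defines a continuous
$2\pi$-periodic function $F : \mathbb{R} \to \mathbb{C}$ follows from the uniform convergence.

Since each $\phi_k(x)$, $k \in \mathbb{Z}$ is Riemann integrable, and the series $\displaystyle\sum_{k=-\infty}^{\infty} c_k e^{ikx}$
converges uniformly, we can integrate term by term to find that

$$c_k(F) = \frac{1}{2\pi} \int_{-\pi}^{\pi} F(x) e^{-ikx}\,dx = \sum_{l=-\infty}^{\infty} c_l \, \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{ilx} e^{-ikx}\,dx = c_k.$$

This proves that the Fourier series of $F : [-\pi, \pi] \to \mathbb{C}$ is the series
$\displaystyle\sum_{k=-\infty}^{\infty} c_k e^{ikx}$ itself.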


We would like to have a sufficient condition for a trigonometric series

$$\sum_{k=-\infty}^{\infty} c_k e^{ikx}$$

to converge uniformly. In volume I, we have discussed the Weierstrass $M$-test for
real-valued functions. This can be generalized to complex-valued functions in a
straightforward way. Let $\{f_n : D \to \mathbb{C}\}$ be a sequence of functions defined on a
subset $D$ of $\mathbb{R}$. Assume that for each $n \in \mathbb{Z}^+$, there is a constant $M_n$ so that

$$|f_n(x)| \leq M_n \quad \text{for all } x \in D.$$

The Weierstrass $M$-test states that if $\displaystyle\sum_{n=1}^{\infty} M_n$ is convergent, then the series of
functions $\displaystyle\sum_{n=1}^{\infty} f_n(x)$ converges absolutely and uniformly.

Using the Weierstrass $M$-test, we can deduce the following.


Theorem 7.26 


If the series $\displaystyle\sum_{k=-\infty}^{\infty} |c_k|$ is convergent, then the trigonometric series
$\displaystyle\sum_{k=-\infty}^{\infty} c_k e^{ikx}$ converges absolutely and uniformly, and it defines a continuous
function $F : \mathbb{R} \to \mathbb{C}$ whose Fourier series is itself.

For any $k \in \mathbb{Z}$,

$$|c_k e^{ikx}| \leq |c_k| \quad \text{for all } x \in \mathbb{R}.$$

By the Weierstrass $M$-test, the series $\displaystyle\sum_{k=-\infty}^{\infty} c_k e^{ikx}$ converges absolutely and
uniformly. The rest follows from Theorem 7.25.


Example 7.17 


Consider the series 




[o-e) 
. ae : : 
Since y a is a p-series with p = 
2) 
k=1 


function F’ : R — R defined as 


F(z)=)> 


k=1 


is a continuous function whose Fourier series is itself. 


Remark 7.11 


By Remark 7.9, in order for the series $\displaystyle\sum_{k=-\infty}^{\infty} c_k e^{ikx}$ to be the Fourier series
of a Riemann integrable function, it is necessary that the series $\displaystyle\sum_{k=-\infty}^{\infty} |c_k|^2$
is convergent. However, the convergence of $\displaystyle\sum_{k=-\infty}^{\infty} |c_k|^2$ does not imply the
convergence of $\displaystyle\sum_{k=-\infty}^{\infty} |c_k|$.


Theorem 7.25 gives a criterion for a function defined as a trigonometric series
to have a Fourier series that is equal to itself. However, we will usually start with a
Riemann integrable function.
Theorem 7.27 
Let $I = [-\pi, \pi]$, and let $f : I \to \mathbb{C}$ be a Riemann integrable function. If
the Fourier series $\displaystyle\sum_{k=-\infty}^{\infty} c_k e^{ikx}$ of $f : I \to \mathbb{C}$ converges uniformly, it defines
a continuous $2\pi$-periodic function $F : \mathbb{R} \to \mathbb{C}$ whose restriction to $I$ is $L^2$
equivalent to $f : I \to \mathbb{C}$. If the periodic extension $\tilde{f} : \mathbb{R} \to \mathbb{C}$ of $f$ is
continuous at $x_0$, then $F(x_0) = \tilde{f}(x_0)$.




The fact that $\displaystyle\sum_{k=-\infty}^{\infty} c_k e^{ikx}$ defines a continuous periodic function $F : \mathbb{R} \to \mathbb{C}$
has been asserted in Theorem 7.25. Since $\displaystyle\sum_{k=-\infty}^{\infty} c_k e^{ikx}$ is the Fourier series
for both $f : I \to \mathbb{C}$ and $F : I \to \mathbb{C}$, Theorem 7.19 says that it converges
in $L^2$ to $f : I \to \mathbb{C}$ and $F : I \to \mathbb{C}$. Theorem 7.15 then asserts that
$F : I \to \mathbb{C}$ and $f : I \to \mathbb{C}$ are $L^2$ equivalent. The values of two $L^2$-equivalent
functions agree at a point where both of them are continuous.


Example 7.18 


In Example 7.5, we have seen that the Fourier series of the function $f : [0, 2\pi] \to \mathbb{R}$, $f(x) = x(2\pi - x)$ is

$$F(x) = \frac{2\pi^2}{3} - 4 \sum_{k=1}^{\infty} \frac{\cos kx}{k^2}.$$

Since the series $\displaystyle\sum_{k=1}^{\infty} \frac{1}{k^2}$ is convergent, and $f : [0, 2\pi] \to \mathbb{R}$ is continuous
with $f(0) = f(2\pi) = 0$, Theorem 7.26 and Theorem 7.27 imply that the
Fourier series $F(x)$ converges uniformly to the periodic extension of the
function $f(x)$.


Example 7.19 


In Example 7.4, we have seen that the Fourier series of the function $f : [-\pi, \pi] \to \mathbb{R}$, $f(x) = x$ is

$$F(x) = \sum_{k=1}^{\infty} \frac{2(-1)^{k-1}}{k} \sin kx.$$

Since the harmonic series $\displaystyle\sum_{k=1}^{\infty} \frac{1}{k}$ is divergent, we cannot apply Theorem
7.26. However, we can argue that the Fourier series does not converge
uniformly in the following way.

In Example 7.10, we have used Dirichlet's theorem to conclude that $F(x)$
converges to $f(x)$ for any $x \in (-\pi, \pi)$. It is obvious that $F(\pi) = 0$.
Now

$$\lim_{x \to \pi^-} F(x) = \lim_{x \to \pi^-} f(x) = \pi \neq F(\pi).$$

This shows that $F$ is not continuous at $x = \pi$. Hence, the convergence of
the Fourier series is not uniform.


Remark 7.12 


Theorem 7.25 and Theorem 7.27 can be regarded as uniqueness results for Fourier
series.


Next we want to consider term-by-term differentiation. 


Example 7.20 


Let us consider again the function $f : [-\pi, \pi] \to \mathbb{R}$, $f(x) = x$, whose
Fourier series is given by

$$\sum_{k=1}^{\infty} \frac{2(-1)^{k-1}}{k} \sin kx.$$

Since $f : (-\pi, \pi) \to \mathbb{R}$ is continuously differentiable with $f'(x) = 1$,
the extension function $\tilde{f} : \mathbb{R} \to \mathbb{R}$ is continuously differentiable at all the
points $x$ that are not odd multiples of $\pi$, and

$$\tilde{f}'(x) = 1 \quad \text{when } x \notin (2\mathbb{Z} + 1)\pi.$$

Hence, for the function $f' : [-\pi, \pi] \to \mathbb{R}$, regardless of how it is defined at
$\pm\pi$, its Fourier series is the constant $G(x) = 1$. However, if we differentiate
the Fourier series of $f : [-\pi, \pi] \to \mathbb{R}$ term by term, we obtain the series

$$\sum_{k=1}^{\infty} 2(-1)^{k-1} \cos kx.$$

As $|a_k| = 2$ for all $k \in \mathbb{Z}^+$, $\displaystyle\sum_{k=1}^{\infty} |a_k|^2$ is divergent. Thus, the series

$$\sum_{k=1}^{\infty} 2(-1)^{k-1} \cos kx$$

cannot be the Fourier series of any Riemann integrable function.


Example 7.20 shows that term-by-term differentiation can fail even though the
function $f : [-\pi, \pi] \to \mathbb{R}$ and its derivative are strongly piecewise differentiable.


Theorem 7.28 


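Let $I = [-\pi, \pi]$ and let $f : I \to \mathbb{R}$ be a Riemann integrable function
with Fourier series $\displaystyle\sum_{k=-\infty}^{\infty} c_k e^{ikx}$. If the series $\displaystyle\sum_{k=-\infty}^{\infty} c_k$ is convergent, and
the series $\displaystyle\sum_{k=-\infty}^{\infty} ik c_k e^{ikx}$ converges uniformly, then $f : I \to \mathbb{R}$ is $L^2$-equivalent
to the continuously differentiable function $F : I \to \mathbb{R}$,
$F(x) = \displaystyle\sum_{k=-\infty}^{\infty} c_k e^{ikx}$. Moreover, the Fourier series of $F' : I \to \mathbb{R}$ is

$$\sum_{k=-\infty}^{\infty} ik c_k e^{ikx},$$

which converges to $F'(x)$ for all $x \in I$.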


By Theorem 7.25, the uniformly convergent series

$$\sum_{k=-\infty}^{\infty} ik c_k e^{ikx}$$

defines a continuous function $G : I \to \mathbb{C}$, whose Fourier series is itself.
Define

$$H(x) = \int_0^x G(t)\,dt + \sum_{k=-\infty}^{\infty} c_k.$$

By the fundamental theorem of calculus, $H$ is differentiable and $H'(x) = G(x)$.
Thus, $H(x)$ is continuously differentiable. Since the series

$$\sum_{k=-\infty}^{\infty} ik c_k e^{ikx}$$

converges uniformly, we can do term-by-term integration to obtain

$$H(x) = \sum_{k=-\infty}^{\infty} ik c_k \int_0^x e^{ikt}\,dt + \sum_{k=-\infty}^{\infty} c_k = \sum_{k=-\infty}^{\infty} c_k \left( e^{ikx} - 1 \right) + \sum_{k=-\infty}^{\infty} c_k = \sum_{k=-\infty}^{\infty} c_k e^{ikx}.$$

Hence, the function $F : I \to \mathbb{C}$,

$$F(x) = \sum_{k=-\infty}^{\infty} c_k e^{ikx}$$

is continuously differentiable, with derivative

$$F'(x) = H'(x) = G(x).$$


This completes the proof. 


Let us look at an example. 


Example 7.21 


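Consider the trigonometric series

$$\sum_{k=1}^{\infty} \frac{\sin kx}{k^3} \quad \text{and} \quad \sum_{k=1}^{\infty} \frac{\cos kx}{k^2}.$$

Since the series $\displaystyle\sum_{k=1}^{\infty} \frac{1}{k^3}$ and $\displaystyle\sum_{k=1}^{\infty} \frac{1}{k^2}$ are convergent, both the series

$$\sum_{k=1}^{\infty} \frac{\sin kx}{k^3} \quad \text{and} \quad \sum_{k=1}^{\infty} \frac{\cos kx}{k^2}$$

converge uniformly, and so they define continuous functions. Let

$$F(x) = \sum_{k=1}^{\infty} \frac{\sin kx}{k^3}, \qquad G(x) = \sum_{k=1}^{\infty} \frac{\cos kx}{k^2}.$$

Since

$$\frac{d}{dx} \frac{\sin kx}{k^3} = \frac{\cos kx}{k^2} \quad \text{for all } k \in \mathbb{Z}^+,$$

$F$ is continuously differentiable and $F'(x) = G(x)$.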


At the end of this section, we give a brief discussion of the Cesàro mean
of a Fourier series. As an application, we give another proof of the Weierstrass
approximation theorem.


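Definition 7.11 Cesàro Mean of a Fourier Series


Given a Riemann integrable function $f : [-\pi, \pi] \to \mathbb{C}$, let its Fourier series
be

$$\sum_{k=-\infty}^{\infty} c_k e^{ikx}, \quad \text{where } c_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-ikx}\,dx.$$

For $n \geq 1$, the $n^{\text{th}}$ Cesàro mean of the Fourier series is

$$\sigma_n(x) = \frac{s_0(x) + s_1(x) + \cdots + s_{n-1}(x)}{n},$$

where

$$s_n(x) = \sum_{k=-n}^{n} c_k e^{ikx}$$

is the $n^{\text{th}}$ partial sum of the Fourier series.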




Proposition 7.29 


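Let $I = [-\pi, \pi]$, and let $f : I \to \mathbb{C}$ be a Riemann integrable function.
The $n^{\text{th}}$ Cesàro mean $\sigma_n(x)$ of the Fourier series of $f$ has an integral
representation given by

$$\sigma_n(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) F_n(x - t)\,dt,$$

where $F_n : \mathbb{R} \to \mathbb{R}$ is the kernel

$$F_n(x) = \begin{cases} n, & \text{if } x \in 2\pi\mathbb{Z}, \\[4pt] \dfrac{\sin^2 \frac{nx}{2}}{n \sin^2 \frac{x}{2}}, & \text{otherwise.} \end{cases}$$

By Proposition 7.8,

$$s_n(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) D_n(x - t)\,dt,$$

where $D_n(t)$ is the Dirichlet kernel. This gives

$$\sigma_n(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) F_n(x - t)\,dt,$$

where

$$F_n(x) = \frac{D_0(x) + D_1(x) + \cdots + D_{n-1}(x)}{n}.$$

Using the fact that

$$D_n(x) = \begin{cases} 2n + 1, & \text{if } x \in 2\pi\mathbb{Z}, \\[4pt] \dfrac{\sin \left( n + \frac{1}{2} \right) x}{\sin \frac{x}{2}}, & \text{otherwise,} \end{cases}$$

we find that when $x \in 2\pi\mathbb{Z}$,

$$F_n(x) = \frac{1 + 3 + \cdots + (2n - 1)}{n} = n.$$

When $x \notin 2\pi\mathbb{Z}$,

$$F_n(x) = \frac{1}{n \sin \frac{x}{2}} \sum_{k=0}^{n-1} \sin \left( k + \tfrac{1}{2} \right) x = \frac{1}{2n \sin^2 \frac{x}{2}} \sum_{k=0}^{n-1} \bigl( \cos kx - \cos (k+1)x \bigr) = \frac{1 - \cos nx}{2n \sin^2 \frac{x}{2}} = \frac{\sin^2 \frac{nx}{2}}{n \sin^2 \frac{x}{2}}.$$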


This completes the proof. 
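
The three facts used about the kernel $F_n$ (it is the average of the Dirichlet kernels $D_0, \ldots, D_{n-1}$, it has the closed form $\sin^2(nx/2) / (n \sin^2(x/2))$, and its mean over a period is 1) can be checked numerically. The following Python sketch does this under the assumption $D_n(t) = \sum_{k=-n}^{n} e^{ikt}$; all function names are illustrative and not part of the text.

```python
import math

def dirichlet(n, t):
    """D_n(t) = sum_{k=-n}^{n} e^{ikt} = 1 + 2 * sum_{k=1}^{n} cos(kt)."""
    return 1.0 + 2.0 * sum(math.cos(k * t) for k in range(1, n + 1))

def fejer_average(n, t):
    """Average of D_0(t), ..., D_{n-1}(t)."""
    return sum(dirichlet(m, t) for m in range(n)) / n

def fejer_closed_form(n, t):
    """sin^2(nt/2) / (n sin^2(t/2)), with the value n at multiples of 2*pi."""
    s = math.sin(t / 2.0)
    if abs(s) < 1e-12:
        return float(n)
    return math.sin(n * t / 2.0) ** 2 / (n * s * s)

def mean_over_period(func, samples=4096):
    """Equally spaced rectangle rule for (1/2pi) * integral of func over [-pi, pi]."""
    h = 2.0 * math.pi / samples
    return sum(func(-math.pi + j * h) for j in range(samples)) * h / (2.0 * math.pi)

for n in (1, 2, 5, 10):
    t = 0.7
    print(n, fejer_average(n, t), fejer_closed_form(n, t),
          mean_over_period(lambda u: fejer_closed_form(n, u)))
```

The closed form makes the nonnegativity of the kernel evident, which the Dirichlet kernels themselves do not have.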


Definition 7.12 
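For $n \geq 1$, the Fejér kernel $F_n : \mathbb{R} \to \mathbb{R}$ is the kernel given by

$$F_n(x) = \begin{cases} n, & \text{if } x \in 2\pi\mathbb{Z}, \\[4pt] \dfrac{\sin^2 \frac{nx}{2}}{n \sin^2 \frac{x}{2}}, & \text{otherwise.} \end{cases}$$

A good property of the Fejér kernel is that $F_n(t) \geq 0$ for all $t \in \mathbb{R}$.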
Now we can prove the following theorem. 


Theorem 7.30 


Let $f : [-\pi, \pi] \to \mathbb{C}$ be a continuous function with $f(-\pi) = f(\pi)$, and let
$\sigma_n(x)$ be the $n^{\text{th}}$ Cesàro mean of the Fourier series of $f$. Then the sequence
of functions $\{\sigma_n : [-\pi, \pi] \to \mathbb{C}\}$ converges uniformly to the function $f : [-\pi, \pi] \to \mathbb{C}$.




Figure 7.12: The Fejér kernels $F_n : [-\pi, \pi] \to \mathbb{R}$ for $2 \leq n \leq 6$.


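As in the proof of Dirichlet's theorem, we find that

$$\sigma_n(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \tilde{f}(x - t) F_n(t)\,dt,$$

where $\tilde{f} : \mathbb{R} \to \mathbb{C}$ is the $2\pi$-periodic extension of $f$. Since

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} D_n(t)\,dt = 1 \quad \text{for all } n \geq 0,$$

we have

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} F_n(t)\,dt = 1 \quad \text{for all } n \geq 1.$$

It follows that for $x \in [-\pi, \pi]$,

$$\sigma_n(x) - f(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left( \tilde{f}(x - t) - \tilde{f}(x) \right) F_n(t)\,dt.$$

For $x \in [-\pi, \pi]$ and $t \in [-\pi, \pi]$, $x - t \in [-2\pi, 2\pi]$. Since $f : [-\pi, \pi] \to \mathbb{C}$
is a continuous function with $f(-\pi) = f(\pi)$, $\tilde{f} : [-2\pi, 2\pi] \to \mathbb{C}$ is
continuous. Hence, it is uniformly continuous.
Given $\varepsilon > 0$, there exists $\delta > 0$ such that if $u$ and $v$ are in $[-2\pi, 2\pi]$ and
$|u - v| < \delta$, then

$$|\tilde{f}(u) - \tilde{f}(v)| < \frac{\varepsilon}{2}.$$

By continuity, if $|u - v| \leq \delta$, then

$$|\tilde{f}(u) - \tilde{f}(v)| \leq \frac{\varepsilon}{2}.$$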



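Being continuous on a compact interval, the function $\tilde{f} : [-2\pi, 2\pi] \to \mathbb{C}$ is
also bounded. Therefore, there exists $M > 0$ such that

$$|\tilde{f}(x)| \leq M \quad \text{for all } x \in [-2\pi, 2\pi].$$

This implies that for any $x \in [-\pi, \pi]$ and $t \in [-\pi, \pi]$,

$$|\tilde{f}(x - t) - \tilde{f}(x)| \leq 2M.$$

On the other hand, if $\delta \leq |t| \leq \pi$,

$$\sin^2 \frac{t}{2} \geq \sin^2 \frac{\delta}{2} > 0.$$

Therefore,

$$F_n(t) = \frac{\sin^2 \frac{nt}{2}}{n \sin^2 \frac{t}{2}} \leq \frac{1}{n \sin^2 \frac{\delta}{2}} \quad \text{when } \delta \leq |t| \leq \pi.$$

Let $N$ be a positive integer such that

$$N > \frac{4M}{\varepsilon \sin^2 \frac{\delta}{2}}.$$

For $n \geq N$ and $x \in [-\pi, \pi]$, we have

$$|\sigma_n(x) - f(x)| \leq \frac{1}{2\pi} \int_{|t| \leq \delta} \left| \tilde{f}(x - t) - \tilde{f}(x) \right| F_n(t)\,dt + \frac{1}{2\pi} \int_{\delta \leq |t| \leq \pi} \left| \tilde{f}(x - t) - \tilde{f}(x) \right| F_n(t)\,dt.$$

We estimate the two terms separately. Since $F_n(t) \geq 0$ for all $t \in [-\pi, \pi]$,

$$\frac{1}{2\pi} \int_{|t| \leq \delta} F_n(t)\,dt \leq \frac{1}{2\pi} \int_{-\pi}^{\pi} F_n(t)\,dt = 1.$$

Hence,

$$\frac{1}{2\pi} \int_{|t| \leq \delta} \left| \tilde{f}(x - t) - \tilde{f}(x) \right| F_n(t)\,dt \leq \frac{\varepsilon}{2}.$$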




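For the second term, we have

$$\frac{1}{2\pi} \int_{\delta \leq |t| \leq \pi} \left| \tilde{f}(x - t) - \tilde{f}(x) \right| F_n(t)\,dt \leq \frac{2M}{2\pi} \int_{\delta \leq |t| \leq \pi} \frac{dt}{n \sin^2 \frac{\delta}{2}} \leq \frac{2M}{N \sin^2 \frac{\delta}{2}} < \frac{\varepsilon}{2}.$$

This shows that for all $n \geq N$,

$$|\sigma_n(x) - f(x)| < \varepsilon \quad \text{for all } x \in [-\pi, \pi].$$

Thus, the sequence of functions $\{\sigma_n : [-\pi, \pi] \to \mathbb{C}\}$ converges uniformly
to the function $f : [-\pi, \pi] \to \mathbb{C}$.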


Notice that since $s_n(x)$ is in the span of $S_n = \{e^{ikx} \mid -n \leq k \leq n\}$, $\sigma_{n+1}(x)$
is also in the span of $S_n$. Now we apply Theorem 7.30 to give another proof of the
Weierstrass approximation theorem.


Theorem 7.31 Weierstrass Approximation Theorem 


Let $f : [a, b] \to \mathbb{R}$ be a continuous function defined on $[a, b]$. Given $\varepsilon > 0$,
there is a polynomial $p(x)$ such that

$$|f(x) - p(x)| < \varepsilon \quad \text{for all } x \in [a, b].$$

It is sufficient to prove the theorem for a specific $[a, b]$. We take $[a, b] = [0, 1]$.
Given that $f : [0, 1] \to \mathbb{R}$ is a real-valued continuous function, we extend
it to an even function $f_e : [-1, 1] \to \mathbb{R}$, and let $g : [-\pi, \pi] \to \mathbb{R}$ be the
function defined as

$$g(x) = f_e(\cos x).$$

This is well-defined since the range of $\cos x$ is $[-1, 1]$. Since $\cos x$ and
$f_e : [-1, 1] \to \mathbb{R}$ are continuous even functions, $g : [-\pi, \pi] \to \mathbb{R}$ is a
continuous even function. Hence, we also have $g(\pi) = g(-\pi)$.




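Given $\varepsilon > 0$, Theorem 7.30 implies that there is a positive integer $n$ such
that

$$|g(x) - \sigma_{n+1}(x)| < \varepsilon \quad \text{for all } x \in [-\pi, \pi]. \tag{7.4}$$

Here $\sigma_{n+1}(x)$ is the $(n+1)^{\text{th}}$ Cesàro mean of the Fourier series of $g$. Since
$g : [-\pi, \pi] \to \mathbb{R}$ is a real-valued even function, the Fourier series of $g$ has
the form

$$\frac{a_0}{2} + \sum_{k=1}^{\infty} a_k \cos kx,$$

where $a_k$, $k \geq 0$ are real. This implies that

$$\sigma_{n+1}(x) = \sum_{k=0}^{n} \alpha_k \cos kx$$

for some real constants $\alpha_0, \alpha_1, \ldots, \alpha_n$. For any $m \geq 1$, $\cos mx$ can be
written as a linear combination of $1, \cos x, \cos^2 x, \ldots, \cos^m x$. This shows
that there are real constants $\beta_0, \beta_1, \ldots, \beta_n$ such that

$$\sigma_{n+1}(x) = \sum_{k=0}^{n} \beta_k \cos^k x.$$

Let

$$p(x) = \sum_{k=0}^{n} \beta_k x^k.$$

Then $\sigma_{n+1}(x) = p(\cos x)$. Thus, (7.4) says that

$$|f_e(\cos x) - p(\cos x)| < \varepsilon \quad \text{for all } x \in [-\pi, \pi].$$

This implies that

$$|f(x) - p(x)| < \varepsilon \quad \text{for all } x \in [0, 1],$$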


which completes the proof of the theorem. 
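
The step of the proof asserting that $\cos mx$ is a linear combination of $1, \cos x, \ldots, \cos^m x$ can be made explicit. The sketch below is one possible way to do it, based on the standard identity $\cos (k+1)x = 2 \cos x \cos kx - \cos (k-1)x$ (the Chebyshev recurrence), which is not stated in the text; the function names are illustrative.

```python
import math

def chebyshev_coefficients(m):
    """Return [c_0, ..., c_m] with cos(m x) = sum_j c_j * cos(x)**j."""
    previous, current = [1], [0, 1]   # polynomials for cos(0x) and cos(1x) in y = cos x
    if m == 0:
        return previous
    for _ in range(m - 1):
        doubled = [0] + [2 * c for c in current]                  # 2 y T_k(y)
        padded = previous + [0] * (len(doubled) - len(previous))  # T_{k-1}(y)
        previous, current = current, [a - b for a, b in zip(doubled, padded)]
    return current

def evaluate(coefficients, y):
    """Evaluate the polynomial with the given coefficients at y."""
    return sum(c * y ** j for j, c in enumerate(coefficients))

for m in range(6):
    x = 0.3
    print(m, chebyshev_coefficients(m),
          evaluate(chebyshev_coefficients(m), math.cos(x)), math.cos(m * x))
```

Replacing each $\cos kx$ in $\sigma_{n+1}(x)$ by its polynomial in $\cos x$ and collecting powers gives the constants $\beta_0, \ldots, \beta_n$ of the proof.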




Remark 7.13 


In the proof of the Weierstrass approximation theorem given above, we use the
Cesàro means rather than the Fourier series itself, since the Fourier series of a
$2\pi$-periodic continuous function does not necessarily converge uniformly. An
example is given in [SS03]. However, there are other approaches to prove the Weierstrass
approximation theorem using Fourier series. For example, one can
approximate a continuous function uniformly by a continuous piecewise
linear function first. The Fourier series of a continuous piecewise linear
function does converge uniformly to the function itself.

In the proof given above, we used the even extension $f_e$ of the given
function $f$. The Fourier series of $f_e(\cos x)$ is a cosine series, so that the
Cesàro mean is a polynomial in $\cos x$. One can also bypass the even
extension and the composition with the cosine function, using directly the
uniform approximation of trigonometric functions by Taylor polynomials,
as asserted by the general theory of power series.




Exercises 7.4 
Question 1 
Consider the function $f : [-\pi, \pi] \to \mathbb{R}$ defined as

$$f(x) = \begin{cases} x + \pi, & \text{if } -\pi \leq x < 0, \\ x - \pi, & \text{if } 0 \leq x \leq \pi. \end{cases}$$

The Fourier series of this function has been obtained in Exercises 7.2. Does
the Fourier series converge uniformly? Justify your answer.


Question 2

Study the uniform convergence of the Fourier series of the function $f : [-\pi, \pi] \to \mathbb{R}$, $f(x) = x^2$ obtained in Exercises 7.1.

Question 3 

Show that the trigonometric series 


* (2k coskx + 3sin kx 
Ds = 


k=1 


defines a continuously differentiable function $F : \mathbb{R} \to \mathbb{R}$, and find the
Fourier series of the function $F' : [-\pi, \pi] \to \mathbb{R}$.




7.5 Fourier Transforms 


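We have seen that the Fourier series of a function $f : [-L, L] \to \mathbb{C}$ defined on
$[-L, L]$ is

$$\sum_{k=-\infty}^{\infty} c_k \exp \left( \frac{i\pi k x}{L} \right),$$

where the Fourier coefficients $c_k$, $k \in \mathbb{Z}$ are given by

$$c_k = \frac{1}{2L} \int_{-L}^{L} f(t) \exp \left( -\frac{i\pi k t}{L} \right) dt.$$

This is also the Fourier series of the $2L$-periodic extension of the function $f$.
Substituting the expression for $c_k$, we find that the Fourier series can be written as

$$\frac{1}{2L} \sum_{k=-\infty}^{\infty} \int_{-L}^{L} f(t) \exp \left( \frac{i\pi k (x - t)}{L} \right) dt. \tag{7.5}$$

Heuristically, with the sample points $\omega_k = \pi k / L$, which have spacing $\Delta\omega = \pi / L$, the sum (7.5) equals $\displaystyle\sum_{k=-\infty}^{\infty} g(\omega_k) \Delta\omega$ and
can be regarded as a Riemann sum for the function $g : \mathbb{R} \to \mathbb{C}$,

$$g(\omega) = \frac{1}{2\pi} \int_{-L}^{L} f(t) e^{i\omega (x - t)}\,dt.$$

In the limit $L \to \infty$, one obtains heuristically the integral

$$\int_{-\infty}^{\infty} g(\omega)\,d\omega = \frac{1}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(t) e^{i\omega (x - t)}\,dt\,d\omega.$$

This motivates us to define the Fourier transform of a function $f : \mathbb{R} \to \mathbb{C}$ as

$$\hat{f}(\omega) = \int_{-\infty}^{\infty} f(t) e^{-i\omega t}\,dt,$$

so that in the limit, (7.5) becomes

$$\frac{1}{2\pi} \int_{-\infty}^{\infty} \hat{f}(\omega) e^{i\omega x}\,d\omega.$$

We know that under certain conditions, the Fourier series of a function would
converge to the function itself. Hence, we can also explore the conditions under which

$$f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(t) e^{i\omega (x - t)}\,dt\,d\omega = \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat{f}(\omega) e^{i\omega x}\,d\omega. \tag{7.6}$$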




However, now the integrals we are working with are improper integrals. Therefore, 
there is another convergence issue that we need to deal with. In this section, we 
only give a brief discussion about Fourier transforms. An in-depth analysis would 
require advanced tools. 


We say that a function $f : \mathbb{R} \to \mathbb{C}$ is Riemann integrable if it is Riemann
integrable on any compact interval.


Definition 7.13 $L^1$ and $L^2$ Functions


Let $f : \mathbb{R} \to \mathbb{C}$ be a Riemann integrable function. We say that $f$ is $L^1$ if the
improper integral

$$\int_{-\infty}^{\infty} |f(x)|\,dx$$

is convergent. In this case, we define the $L^1$-norm of $f$ as

$$\|f\|_1 = \int_{-\infty}^{\infty} |f(x)|\,dx.$$

We say that $f$ is $L^2$ if the improper integral

$$\int_{-\infty}^{\infty} |f(x)|^2\,dx$$

is convergent. In this case, we define the $L^2$-norm of $f$ as

$$\|f\|_2 = \sqrt{\int_{-\infty}^{\infty} |f(x)|^2\,dx}.$$


If $f : [a, b] \to \mathbb{C}$ is any Riemann integrable function, the zero-extension of $f$
to $\mathbb{R}$ is both an $L^1$ and an $L^2$ function. As before, the $L^1$ and $L^2$ norms are
semi-norms which are positive semi-definite, where there are nonzero functions that
have zero norms.


Example 7.22 


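Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined as

$$f(x) = \frac{1}{\sqrt{x^2 + 1}}.$$

The integral

$$\int_{-\infty}^{\infty} \frac{1}{\sqrt{x^2 + 1}}\,dx$$

is not convergent, but the integral

$$\int_{-\infty}^{\infty} \frac{1}{x^2 + 1}\,dx$$

is convergent. Hence, $f : \mathbb{R} \to \mathbb{R}$ is $L^2$ but not $L^1$.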


Definition 7.14 Fourier transform 


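Let $f : \mathbb{R} \to \mathbb{C}$ be a Riemann integrable function. The Fourier transform
of $f$, denoted by $\mathcal{F}[f]$ or $\hat{f}$, is defined as

$$\mathcal{F}[f](\omega) = \hat{f}(\omega) = \int_{-\infty}^{\infty} f(t) e^{-i\omega t}\,dt,$$

for all $\omega \in \mathbb{R}$ for which this improper integral is convergent.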


Example 7.23 


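If $f : \mathbb{R} \to \mathbb{C}$ is an $L^1$-function, then for any $\omega \in \mathbb{R}$, the integral

$$\int_{-\infty}^{\infty} f(t) e^{-i\omega t}\,dt$$

converges absolutely. Hence, an $L^1$ function has a Fourier transform $\hat{f}$ which
is defined on $\mathbb{R}$. In particular, a function that vanishes outside a bounded
interval has a Fourier transform that is defined for all $\omega \in \mathbb{R}$.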


Proposition 7.32 


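The Fourier transform is a linear operation. Namely, if $f : \mathbb{R} \to \mathbb{C}$ and $g : \mathbb{R} \to \mathbb{C}$
are functions that have Fourier transforms, then for any complex numbers
$\alpha$ and $\beta$, the function $\alpha f + \beta g : \mathbb{R} \to \mathbb{C}$ also has a Fourier transform, and

$$\mathcal{F}[\alpha f + \beta g] = \alpha \mathcal{F}[f] + \beta \mathcal{F}[g].$$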




Remark 7.14 


In engineering, it is customary to use $t$ as the independent variable for the
function $f : \mathbb{R} \to \mathbb{C}$, and $\omega$ as the independent variable for its Fourier
transform $\hat{f} : \mathbb{R} \to \mathbb{C}$. The function $f$ is usually a function of time $t$,
and its Fourier transform is a function of frequency $\omega$. Hence, the Fourier
transform is a transform from the time domain to the frequency domain.


Example 7.24 


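Let $a$ and $b$ be two real numbers with $a < b$. Define the function $g : \mathbb{R} \to \mathbb{R}$
by

$$g(t) = \begin{cases} 1, & \text{if } a \leq t \leq b, \\ 0, & \text{otherwise.} \end{cases}$$

The Fourier transform of $g$ is

$$\hat{g}(\omega) = \int_a^b e^{-i\omega t}\,dt = \begin{cases} \dfrac{e^{-ia\omega} - e^{-ib\omega}}{i\omega}, & \text{if } \omega \neq 0, \\[6pt] b - a, & \text{if } \omega = 0. \end{cases}$$


Of special interest is when $g : \mathbb{R} \to \mathbb{R}$ is given by

$$g(t) = \begin{cases} 1, & \text{if } -a \leq t \leq a, \\ 0, & \text{otherwise,} \end{cases} \tag{7.7}$$

which is an even function. Example 7.24 shows that its Fourier transform is

$$\hat{g}(\omega) = \frac{2 \sin a\omega}{\omega}.$$

One can show that this function is not $L^1$ but is $L^2$.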




Figure 7.13: The function g : R — R defined by (7.7) with a = 1. 


Figure 7.14: The Fourier transform of the function $g : \mathbb{R} \to \mathbb{R}$ defined by (7.7)
with $a = 1$.


Remark 7.15 


A function that vanishes outside a bounded interval is said to have compact
support. In general, the support of a function $f : \mathbb{R} \to \mathbb{C}$ is defined to be
the closure of the set of those points $x$ such that $f(x) \neq 0$. Namely,

$$\operatorname{support} f = \overline{\{x \in \mathbb{R} \mid f(x) \neq 0\}}.$$


Since a set is bounded if and only if its closure is bounded, a function f has 
compact support if and only if the set of points where f does not vanish is 
bounded. 


Let us look at Fourier transforms of functions that do not have compact
support.




Example 7.25 


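Let $a$ be a positive number, and let $f : \mathbb{R} \to \mathbb{R}$ be the function defined as
$f(t) = e^{-a|t|}$. Find the Fourier transform of $f$.


Solution
The Fourier transform of $f$ is given by

$$\hat{f}(\omega) = \int_{-\infty}^{\infty} e^{-a|t|} e^{-i\omega t}\,dt = \lim_{L \to \infty} \int_0^L e^{-at} \left( e^{-i\omega t} + e^{i\omega t} \right) dt$$

$$= \lim_{L \to \infty} \int_0^L \left( e^{-(a + i\omega)t} + e^{-(a - i\omega)t} \right) dt = \lim_{L \to \infty} \left[ -\frac{e^{-(a + i\omega)t}}{a + i\omega} - \frac{e^{-(a - i\omega)t}}{a - i\omega} \right]_0^L$$

$$= \frac{1}{a + i\omega} + \frac{1}{a - i\omega} = \frac{2a}{a^2 + \omega^2}.$$

Notice that the function $\hat{f} : \mathbb{R} \to \mathbb{C}$, $\hat{f}(\omega) = \dfrac{2a}{a^2 + \omega^2}$ is a function that is both
$L^1$ and $L^2$.


Figure 7.15: The function $f : \mathbb{R} \to \mathbb{R}$, $f(t) = e^{-|t|}$.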
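
The closed form $2a / (a^2 + \omega^2)$ for the Fourier transform of $e^{-a|t|}$ can be compared with a direct numerical approximation of the defining integral. The sketch below truncates the integral to $[-40, 40]$ and uses a midpoint rule; the cutoff, the step count and the function name are arbitrary illustrative choices, and only the cosine part is summed because the sine part is odd and integrates to zero.

```python
import math

def fourier_transform_numeric(a, omega, cutoff=40.0, steps=200000):
    """Midpoint-rule approximation of the integral of e^{-a|t|} e^{-i omega t} over [-cutoff, cutoff]."""
    h = 2.0 * cutoff / steps
    total = 0.0
    for j in range(steps):
        t = -cutoff + (j + 0.5) * h
        # real part of e^{-a|t|} e^{-i omega t}; the imaginary part is odd in t
        total += math.exp(-a * abs(t)) * math.cos(omega * t)
    return total * h

a = 1.0
for omega in (0.0, 1.0, 2.0):
    print(omega, fourier_transform_numeric(a, omega), 2.0 * a / (a * a + omega * omega))
```

The truncation is harmless here because the tail of $e^{-a|t|}$ beyond $|t| = 40$ is of order $e^{-40a}$.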


Figure 7.16: The function $\hat{f} : \mathbb{R} \to \mathbb{R}$, $\hat{f}(\omega) = \dfrac{2}{1 + \omega^2}$, which is the Fourier
transform of $f : \mathbb{R} \to \mathbb{R}$, $f(t) = e^{-|t|}$.


A function $f : \mathbb{R} \to \mathbb{R}$ of the form

$$f(x) = c \exp \left( -\frac{(x - \mu)^2}{2\sigma^2} \right) \tag{7.8}$$

is the probability density function of a normal distribution with mean $\mu$ and standard
deviation $\sigma$ when

$$c = \frac{1}{\sqrt{2\pi}\,\sigma}.$$

It is also known as a Gaussian function. These functions are infinitely differentiable
and they decay exponentially to 0 when $|x|$ gets large. When $\mu = 0$ and $\sigma = 1$,

$$f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$$

is the probability density of the standard normal distribution. The Fourier transform
of the Gaussian function $f(t) = \exp \left( -\dfrac{t^2}{2} \right)$ is

$$\hat{f}(\omega) = \int_{-\infty}^{\infty} e^{-t^2/2} e^{-i\omega t}\,dt = \int_{-\infty}^{\infty} \exp \left( -\frac{1}{2} (t + i\omega)^2 - \frac{\omega^2}{2} \right) dt = e^{-\omega^2/2} \int_{-\infty}^{\infty} e^{-t^2/2}\,dt = \sqrt{2\pi}\, e^{-\omega^2/2}.$$

In the computation, the equality

$$\int_{-\infty}^{\infty} \exp \left( -\frac{1}{2} (t + i\omega)^2 \right) dt = \int_{-\infty}^{\infty} e^{-t^2/2}\,dt$$

can be understood in complex analysis as shifting the contour of integration. We
leave the details to the students.




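Notice that for the function $f(t) = \exp \left( -\dfrac{t^2}{2} \right)$, its Fourier transform $\hat{f}(\omega)$ is
equal to $f(\omega)$ multiplied by $\sqrt{2\pi}$. Namely,

$$\hat{f}(\omega) = \sqrt{2\pi}\, f(\omega).$$

The factor $\sqrt{2\pi}$ here is due to our normalization. Different textbooks use different
conventions for Fourier transforms. Among them are the following:

$$\int_{-\infty}^{\infty} f(t) e^{-i\omega t}\,dt, \qquad \int_{-\infty}^{\infty} f(t) e^{i\omega t}\,dt,$$

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t) e^{-i\omega t}\,dt, \qquad \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t) e^{i\omega t}\,dt,$$

$$\frac{1}{2\pi} \int_{-\infty}^{\infty} f(t) e^{-i\omega t}\,dt, \qquad \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t) e^{i\omega t}\,dt.$$

Some might also replace $i\omega t$ by $2\pi i\omega t$. When one is reading about Fourier transforms,
it is important to check the definition of Fourier transform that is being used.

One can show that in our definition, the Fourier transform of the Gaussian
function $f(t) = e^{-at^2}$ with $a > 0$ is

$$\hat{f}(\omega) = \sqrt{\frac{\pi}{a}} \exp \left( -\frac{\omega^2}{4a} \right),$$

so that

$$\frac{1}{\sqrt{2\pi}} \hat{f}(\omega) = \frac{1}{\sqrt{2a}}\, e^{-\omega^2/(4a)}.$$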




Remark 7.16 


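If $f : \mathbb{R} \to \mathbb{C}$ is an $L^1$ function and

$$\|f\|_1 = \int_{-\infty}^{\infty} |f(t)|\,dt = 0,$$

we say that $f$ is $L^1$-equivalent to the zero function. If $f : \mathbb{R} \to \mathbb{C}$ is $L^1$-equivalent
to the zero function, then for any $\omega \in \mathbb{R}$,

$$\left| \hat{f}(\omega) \right| = \left| \int_{-\infty}^{\infty} f(t) e^{-i\omega t}\,dt \right| \leq \int_{-\infty}^{\infty} \left| f(t) e^{-i\omega t} \right| dt = \int_{-\infty}^{\infty} |f(t)|\,dt = 0.$$

Thus, the Fourier transform of $f$ is identically zero.


Example 7.24 shows that the Fourier transform of an $L^1$ function is not necessarily
$L^1$. Nevertheless, we have the following, which is an extension of the
Riemann-Lebesgue lemma to $L^1$ functions on $\mathbb{R}$.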


Theorem 7.33 Extended Riemann-Lebesgue Lemma 


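If the function $f : \mathbb{R} \to \mathbb{C}$ is $L^1$, then

$$\lim_{\beta \to \pm\infty} \int_{-\infty}^{\infty} f(t) e^{i\beta t}\,dt = 0.$$

In other words, the Fourier transform $\hat{f} : \mathbb{R} \to \mathbb{C}$ of $f$ is a function
satisfying

$$\lim_{\omega \to \pm\infty} \hat{f}(\omega) = 0.$$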
W500 


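We are given that

$$\int_{-\infty}^{\infty} |f(t)|\,dt < \infty.$$

Given $\varepsilon > 0$, there is an $L > 0$ such that

$$\int_{|t| \geq L} |f(t)|\,dt < \frac{\varepsilon}{2}.$$

By the triangle inequality, we have

$$\left| \int_{-\infty}^{\infty} f(t) e^{i\beta t}\,dt \right| \leq \left| \int_{-L}^{L} f(t) e^{i\beta t}\,dt \right| + \left| \int_{|t| \geq L} f(t) e^{i\beta t}\,dt \right|.$$

For the second term, we have

$$\left| \int_{|t| \geq L} f(t) e^{i\beta t}\,dt \right| \leq \int_{|t| \geq L} \left| f(t) e^{i\beta t} \right| dt = \int_{|t| \geq L} |f(t)|\,dt < \frac{\varepsilon}{2}.$$

By the Riemann-Lebesgue lemma,

$$\lim_{\beta \to \pm\infty} \int_{-L}^{L} f(t) e^{i\beta t}\,dt = 0.$$

Therefore, there exists $M > 0$ such that if $|\beta| \geq M$, then

$$\left| \int_{-L}^{L} f(t) e^{i\beta t}\,dt \right| < \frac{\varepsilon}{2}.$$

It follows that for all $|\beta| \geq M$,

$$\left| \int_{-\infty}^{\infty} f(t) e^{i\beta t}\,dt \right| < \varepsilon.$$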


This proves the assertion. 


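The following theorem imposes a strong condition on a function $g : \mathbb{R} \to \mathbb{C}$
to be the Fourier transform of an $L^1$ function $f : \mathbb{R} \to \mathbb{C}$.


Theorem 7.34


If $f : \mathbb{R} \to \mathbb{C}$ is an $L^1$ function, then its Fourier transform $\hat{f} : \mathbb{R} \to \mathbb{C}$ is
uniformly continuous.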


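We are given that

$$I = \int_{-\infty}^{\infty} |f(t)|\,dt < \infty.$$

Without loss of generality, we can assume that $I > 0$. Notice that for any
$\omega_1$ and $\omega_2$ in $\mathbb{R}$,

$$\hat{f}(\omega_1) - \hat{f}(\omega_2) = \int_{-\infty}^{\infty} f(t) \left( e^{-i\omega_1 t} - e^{-i\omega_2 t} \right) dt.$$

Given $\varepsilon > 0$, there is an $L > 0$ such that

$$\int_{|t| \geq L} |f(t)|\,dt < \frac{\varepsilon}{3}.$$

By the triangle inequality, we have

$$\left| \hat{f}(\omega_1) - \hat{f}(\omega_2) \right| \leq \int_{-\infty}^{\infty} |f(t)| \left| e^{-i\omega_1 t} - e^{-i\omega_2 t} \right| dt = \int_{|t| \leq L} |f(t)| \left| e^{-i\omega_1 t} - e^{-i\omega_2 t} \right| dt + \int_{|t| \geq L} |f(t)| \left| e^{-i\omega_1 t} - e^{-i\omega_2 t} \right| dt.$$

The second term is easy to estimate since $\left| e^{-i\omega_1 t} - e^{-i\omega_2 t} \right| \leq 2$. We have

$$\int_{|t| \geq L} |f(t)| \left| e^{-i\omega_1 t} - e^{-i\omega_2 t} \right| dt \leq 2 \int_{|t| \geq L} |f(t)|\,dt < \frac{2\varepsilon}{3}.$$

Since the function $g : \mathbb{R} \to \mathbb{C}$, $g(u) = e^{iu}$ is continuous at $u = 0$, there
exists a $\delta > 0$ such that if $|u| < \delta$, then

$$\left| e^{iu} - 1 \right| < \frac{\varepsilon}{3I}.$$

Thus, given $\omega_1$ and $\omega_2$ in $\mathbb{R}$, if $|\omega_1 - \omega_2| < \dfrac{\delta}{L}$, then for any $t \in [-L, L]$,

$$|(\omega_1 - \omega_2) t| \leq L |\omega_1 - \omega_2| < \delta.$$

It follows that

$$\left| e^{-i\omega_1 t} - e^{-i\omega_2 t} \right| = \left| e^{-i(\omega_1 - \omega_2)t} - 1 \right| < \frac{\varepsilon}{3I}.$$

Therefore,

$$\int_{|t| \leq L} |f(t)| \left| e^{-i\omega_1 t} - e^{-i\omega_2 t} \right| dt \leq \frac{\varepsilon}{3I} \int_{|t| \leq L} |f(t)|\,dt \leq \frac{\varepsilon}{3}.$$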




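This proves that whenever $|\omega_1 - \omega_2| < \dfrac{\delta}{L}$, then

$$\left| \hat{f}(\omega_1) - \hat{f}(\omega_2) \right| < \varepsilon.$$

Hence, $\hat{f} : \mathbb{R} \to \mathbb{C}$ is uniformly continuous.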


Example 7.26 


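By Example 7.24, the Fourier transform of the $L^1$ function $f : \mathbb{R} \to \mathbb{R}$,

$$f(t) = \begin{cases} 1, & \text{if } -1 \leq t \leq 1, \\ 0, & \text{otherwise,} \end{cases}$$

is $\hat{f}(\omega) = \dfrac{2 \sin \omega}{\omega}$. Theorem 7.34 then implies that the function $g : \mathbb{R} \to \mathbb{R}$,

$$g(x) = \begin{cases} \dfrac{\sin x}{x}, & \text{if } x \neq 0, \\[4pt] 1, & \text{if } x = 0, \end{cases}$$

is uniformly continuous.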


Motivated by the heuristics (7.6) from the theory of Fourier series, we make 
the following definition. 


Definition 7.15 Inverse Fourier Transform 


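Given a Riemann integrable function $\hat{f} : \mathbb{R} \to \mathbb{C}$, we define its inverse
Fourier transform by

$$\mathcal{F}^{-1}[\hat{f}](t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat{f}(\omega) e^{i\omega t}\,d\omega$$

for all $t \in \mathbb{R}$ where this integral is convergent.


Notice that if $f : \mathbb{R} \to \mathbb{C}$ is an $L^1$ function, then $\mathcal{F}^{-1}[f](t)$ exists for all $t \in \mathbb{R}$,
and

$$\mathcal{F}^{-1}[f](t) = \frac{1}{2\pi} \mathcal{F}[f](-t).$$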


In other words, we have the following. 




Proposition 7.35 


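Let $f : \mathbb{R} \to \mathbb{C}$ be a Riemann integrable function. If $f : \mathbb{R} \to \mathbb{C}$ has
Fourier transform $\hat{f} : \mathbb{R} \to \mathbb{C}$, and the function $\hat{f} : \mathbb{R} \to \mathbb{C}$ has inverse
Fourier transform given by $h : \mathbb{R} \to \mathbb{C}$, then the function $g : \mathbb{R} \to \mathbb{C}$,
$g(t) = \hat{f}(t)$ has a Fourier transform given by

$$\hat{g}(\omega) = 2\pi h(-\omega).$$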


Example 7.27 


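For the function $f : \mathbb{R} \to \mathbb{C}$, $f(t) = e^{-at^2}$ with $a > 0$, its Fourier transform is $\hat{f} : \mathbb{R} \to \mathbb{C}$,

$$\hat{f}(\omega) = \sqrt{\frac{\pi}{a}} \exp \left( -\frac{\omega^2}{4a} \right).$$

Therefore, the function $g(t) = \hat{f}(t)$ has Fourier transform

$$\hat{g}(\omega) = \sqrt{\frac{\pi}{a}} \times \sqrt{4\pi a} \exp \left( -a\omega^2 \right) = 2\pi f(-\omega),$$

while Proposition 7.35 gives $\hat{g}(\omega) = 2\pi \mathcal{F}^{-1}[\hat{f}](-\omega)$.


Example 7.27 shows that for the function $f : \mathbb{R} \to \mathbb{C}$, $f(t) = e^{-at^2}$, we have

$$\mathcal{F}^{-1}[\hat{f}](t) = f(t) \quad \text{for all } t \in \mathbb{R}.$$

In general, we are interested in the following. If $f : \mathbb{R} \to \mathbb{C}$ is a Riemann
integrable function with Fourier transform $\hat{f} : \mathbb{R} \to \mathbb{C}$, under what conditions
does $\mathcal{F}^{-1}[\hat{f}](t)$ exist and

$$\mathcal{F}^{-1}[\hat{f}](t) = f(t)? \tag{7.9}$$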




Example 7.28 


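For the function $g : \mathbb{R} \to \mathbb{R}$,

$$g(\omega) = \frac{2a}{a^2 + \omega^2} \quad \text{with } a > 0,$$

one can use contour integration techniques in complex analysis to show that
when $t \in \mathbb{R}$,

$$\mathcal{F}^{-1}[g](t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{2a\, e^{i\omega t}}{a^2 + \omega^2}\,d\omega = e^{-a|t|}.$$

Hence, for the function $f : \mathbb{R} \to \mathbb{C}$, $f(t) = e^{-a|t|}$, we also have

$$\mathcal{F}^{-1}[\hat{f}](t) = f(t) \quad \text{for all } t \in \mathbb{R}.$$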


Definition 7.16 Fourier Transform Pairs 


If $f : \mathbb{R} \to \mathbb{C}$ and $g : \mathbb{R} \to \mathbb{C}$ are Riemann integrable functions, and
$\mathcal{F}[f] = g$, $\mathcal{F}^{-1}[g] = f$,


then we call the pair of functions (f, g) a Fourier transform pair. 


Example 7.29 


For $a > 0$, let $f : \mathbb{R} \to \mathbb{C}$ be the function $f(t) = e^{-a|t|}$, and let $g : \mathbb{R} \to \mathbb{C}$
be the function $g(t) = \dfrac{2a}{a^2 + t^2}$. Then Example 7.28 says that $(f, g)$ is a

Fourier transform pair. 


The following is important for the proofs later. 


Theorem 7.36 


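The function $f : \mathbb{R} \to \mathbb{R}$,

$$f(x) = \begin{cases} \dfrac{\sin x}{x}, & \text{if } x \neq 0, \\[4pt] 1, & \text{if } x = 0, \end{cases}$$

is an infinitely differentiable even function that satisfies

$$\lim_{L \to \infty} \int_0^L \frac{\sin x}{x}\,dx = \frac{\pi}{2}.$$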




The fact that $f : \mathbb{R} \to \mathbb{R}$, $f(x) = \dfrac{\sin x}{x}$ is infinitely differentiable has been
established earlier. The formula for the improper integral can be proved using
contour integration techniques and the fact that the function $g(z) = \dfrac{e^{iz}}{z}$ has a
simple pole at $z = 0$ with residue 1. See for example [CB84].


Corollary 7.37 


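For any $a > 0$,

$$\lim_{L \to \infty} \int_0^a \frac{\sin Lx}{x}\,dx = \frac{\pi}{2}.$$

By a change of variables,

$$\int_0^a \frac{\sin Lx}{x}\,dx = \int_0^{aL} \frac{\sin x}{x}\,dx.$$

Therefore,

$$\lim_{L \to \infty} \int_0^a \frac{\sin Lx}{x}\,dx = \lim_{L \to \infty} \int_0^{aL} \frac{\sin x}{x}\,dx = \frac{\pi}{2}.$$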
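
The limit in Corollary 7.37 and the substitution behind it can be illustrated numerically. The following Python sketch approximates $\int_0^a \sin (Lx) / x\,dx$ by a midpoint rule (so the removable singularity at $x = 0$ is never sampled); the function name and the step count are illustrative assumptions.

```python
import math

def sine_integral_on(a, L, steps=100000):
    """Midpoint-rule approximation of the integral of sin(L x)/x over [0, a]."""
    h = a / steps
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) * h          # midpoints avoid the point x = 0
        total += math.sin(L * x) / x
    return total * h

for L in (10.0, 50.0, 200.0):
    print(L, sine_integral_on(1.0, L), math.pi / 2.0)

# The value depends only on the product a * L, as the substitution suggests.
print(sine_integral_on(0.5, 400.0), sine_integral_on(1.0, 200.0))
```

The convergence to $\pi/2$ is oscillatory, with an error of order $1/(aL)$, so moderate values of $L$ already give two or three correct digits.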


Now we can prove our main theorem. A function $f : \mathbb{R} \to \mathbb{C}$ is said to be
strongly piecewise differentiable if it is strongly piecewise differentiable on any
compact interval.


Theorem 7.38 Fourier Inversion Theorem 


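Let $f : \mathbb{R} \to \mathbb{C}$ be an $L^1$-function that is strongly piecewise differentiable,
and let

$$\hat{f}(\omega) = \int_{-\infty}^{\infty} f(t) e^{-i\omega t}\,dt$$

be its Fourier transform. Then for any $x \in \mathbb{R}$,

$$\lim_{L \to \infty} \frac{1}{2\pi} \int_{-L}^{L} \hat{f}(\omega) e^{i\omega x}\,d\omega = \frac{f(x_+) + f(x_-)}{2}.$$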


Notice that

$$\int_{-L}^{L}\hat f(\omega)e^{i\omega x}\,d\omega=\int_{-L}^{L}\int_{-\infty}^{\infty}f(t)e^{i\omega(x-t)}\,dt\,d\omega=\int_{-L}^{L}\int_{-\infty}^{\infty}f(x-t)e^{i\omega t}\,dt\,d\omega.$$

To continue, we need a technical lemma which guarantees that we can interchange the order of integration.


Lemma 7.39 


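Let $f:\mathbb{R}\to\mathbb{C}$ be a function that satisfies the conditions in Theorem 7.38. Then for any $L>0$, we have

$$\int_{-L}^{L}\int_{-\infty}^{\infty}f(x-t)e^{i\omega t}\,dt\,d\omega=\int_{-\infty}^{\infty}\int_{-L}^{L}f(x-t)e^{i\omega t}\,d\omega\,dt.$$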


Assuming this lemma, we can continue with the proof of Theorem 7.38. 


Proof of Theorem 7.38 Continued 
By Lemma 7.39, we have

$$\int_{-L}^{L}\hat f(\omega)e^{i\omega x}\,d\omega=\int_{-\infty}^{\infty}\int_{-L}^{L}f(x-t)e^{i\omega t}\,d\omega\,dt.$$

Now we can integrate with respect to $\omega$ and obtain

$$\int_{-L}^{L}e^{i\omega t}\,d\omega=\frac{2\sin Lt}{t}.$$

Using the fact that $\dfrac{\sin Lt}{t}$ is an even function, we find that

$$\int_{-L}^{L}\hat f(\omega)e^{i\omega x}\,d\omega=2\int_0^{\infty}\frac{f(x+t)+f(x-t)}{t}\sin Lt\,dt.$$

Splitting the integral into two parts, we have

$$\int_{-L}^{L}\hat f(\omega)e^{i\omega x}\,d\omega=2\int_0^{1}\frac{f(x+t)+f(x-t)}{t}\sin Lt\,dt+2\int_1^{\infty}\frac{f(x+t)+f(x-t)}{t}\sin Lt\,dt.$$

Let

$$u=\frac{f_-(x)+f_+(x)}{2}.$$

As in the proof of Lemma 7.12, the function $h:[0,1]\to\mathbb{C}$ with

$$h(t)=\frac{f(x+t)+f(x-t)-2u}{t}\quad\text{when } t\in(0,1]$$

is a Riemann integrable function. Therefore, the Riemann-Lebesgue lemma implies that

$$\lim_{L\to\infty}\int_0^1 h(t)\sin Lt\,dt=0.$$

It follows from Corollary 7.37 that

$$\lim_{L\to\infty}2\int_0^1\frac{f(x+t)+f(x-t)}{t}\sin Lt\,dt=\lim_{L\to\infty}4u\int_0^1\frac{\sin Lt}{t}\,dt=2\pi u.$$

On the other hand,

$$\int_1^{\infty}\frac{|f(x+t)+f(x-t)|}{t}\,dt\le\int_{-\infty}^{\infty}|f(t)|\,dt<\infty.$$

By the extended Riemann-Lebesgue lemma,

$$\lim_{L\to\infty}\int_1^{\infty}\frac{f(x+t)+f(x-t)}{t}\sin Lt\,dt=0.$$

This completes the proof that

$$\lim_{L\to\infty}\frac{1}{2\pi}\int_{-L}^{L}\hat f(\omega)e^{i\omega x}\,d\omega=\frac{f_-(x)+f_+(x)}{2}.$$


Now we prove Lemma 7.39. 


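Proof of Lemma 7.39

Given $\varepsilon>0$, since

$$\int_{-\infty}^{\infty}|f(t)|\,dt<\infty,$$

there is an $M>0$ such that

$$\int_{|t|>M}|f(t)|\,dt<\frac{\varepsilon}{4L}.$$

Since $e^{i\omega t}$ is an infinitely differentiable function, and $f(t)$ is a piecewise continuous function on any compact interval, Fubini's theorem implies that

$$\int_{-L}^{L}\int_{x-M}^{x+M}f(x-t)e^{i\omega t}\,dt\,d\omega=\int_{x-M}^{x+M}\int_{-L}^{L}f(x-t)e^{i\omega t}\,d\omega\,dt.$$

Now,

$$\left|\int_{-L}^{L}\int_{-\infty}^{\infty}f(x-t)e^{i\omega t}\,dt\,d\omega-\int_{-L}^{L}\int_{x-M}^{x+M}f(x-t)e^{i\omega t}\,dt\,d\omega\right|\le\int_{-L}^{L}\int_{|x-t|>M}|f(x-t)|\,dt\,d\omega=2L\int_{|t|>M}|f(t)|\,dt<\frac{\varepsilon}{2}.$$

On the other hand, since $|\sin Lt|\le L|t|$ for all $t\in\mathbb{R}$, we have

$$\left|\int_{-\infty}^{\infty}\int_{-L}^{L}f(x-t)e^{i\omega t}\,d\omega\,dt-\int_{x-M}^{x+M}\int_{-L}^{L}f(x-t)e^{i\omega t}\,d\omega\,dt\right|\le 2\int_{|x-t|>M}|f(x-t)|\left|\frac{\sin Lt}{t}\right|dt\le 2L\int_{|t|>M}|f(t)|\,dt<\frac{\varepsilon}{2}.$$

This proves that

$$\left|\int_{-L}^{L}\int_{-\infty}^{\infty}f(x-t)e^{i\omega t}\,dt\,d\omega-\int_{-\infty}^{\infty}\int_{-L}^{L}f(x-t)e^{i\omega t}\,d\omega\,dt\right|<\varepsilon.$$

Since $\varepsilon>0$ is arbitrary, the assertion follows.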


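Corollary 7.40

Let $f:\mathbb{R}\to\mathbb{C}$ be an $L^1$ function that is continuous and strongly piecewise differentiable, and let

$$\hat f(\omega)=\int_{-\infty}^{\infty}f(t)e^{-i\omega t}\,dt$$

be its Fourier transform. If $\hat f:\mathbb{R}\to\mathbb{C}$ is also an $L^1$ function, then for any $x\in\mathbb{R}$,

$$f(x)=\frac{1}{2\pi}\int_{-\infty}^{\infty}\hat f(\omega)e^{i\omega x}\,d\omega.$$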


Example 7.30 


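Since the function $g$ defined by

$$g(t)=\begin{cases}1,&\text{if } -a<t<a,\\ 0,&\text{otherwise},\end{cases}$$

is a strongly piecewise differentiable $L^1$ function with Fourier transform

$$\hat g(\omega)=\frac{2\sin a\omega}{\omega},$$

the Fourier inversion theorem implies that for $|t|<a$,

$$\lim_{L\to\infty}\frac{1}{\pi}\int_{-L}^{L}\frac{\sin a\omega}{\omega}e^{i\omega t}\,d\omega=1,$$

while if $|t|>a$,

$$\lim_{L\to\infty}\frac{1}{\pi}\int_{-L}^{L}\frac{\sin a\omega}{\omega}e^{i\omega t}\,d\omega=0,$$

and for $|t|=a$,

$$\lim_{L\to\infty}\frac{1}{\pi}\int_{-L}^{L}\frac{\sin a\omega}{\omega}e^{i\omega t}\,d\omega=\frac{1}{2}.$$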
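
The three limits in Example 7.30 can be illustrated with a truncated inversion integral. The sketch below assumes $a=1$ in its demonstration and uses the fact that the imaginary part of the integrand is odd in $\omega$, so that the integral reduces to $\frac{2}{\pi}\int_0^L \frac{\sin a\omega\,\cos\omega t}{\omega}\,d\omega$; the name `partial_inversion` is hypothetical.

```python
import math

def simpson(func, lo, hi, n):
    """Composite Simpson rule on [lo, hi] with n subintervals (n is made even)."""
    n += n % 2
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3

def partial_inversion(t, a, L, n=60000):
    """Approximate (1/pi) * int_{-L}^{L} sin(a w)/w * exp(i w t) dw.

    Only the even (cosine) part survives, so we integrate over [0, L] and double.
    """
    def integrand(w):
        if w == 0.0:
            return a  # limit of sin(a w) / w as w -> 0
        return math.sin(a * w) * math.cos(w * t) / w
    return (2.0 / math.pi) * simpson(integrand, 0.0, L, n)

if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 2.5):
        print(t, partial_inversion(t, a=1.0, L=300.0))
```

For large $L$ the printed values should be close to 1 inside the interval, close to 1/2 at the jump, and close to 0 outside, matching the average of the one-sided limits.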


If $f:\mathbb{R}\to\mathbb{C}$ and $g:\mathbb{R}\to\mathbb{C}$ are $L^2$ functions, then the Cauchy-Schwarz inequality implies that for any $L>0$,

$$\left(\int_{-L}^{L}|f(t)g(t)|\,dt\right)^2\le\left(\int_{-L}^{L}|f(t)|^2\,dt\right)\left(\int_{-L}^{L}|g(t)|^2\,dt\right)\le\left(\int_{-\infty}^{\infty}|f(t)|^2\,dt\right)\left(\int_{-\infty}^{\infty}|g(t)|^2\,dt\right).$$

This implies that the improper integral

$$\int_{-\infty}^{\infty}f(t)\overline{g(t)}\,dt$$

converges absolutely. Thus, we can define a positive semi-definite inner product on the space of $L^2$ functions on $\mathbb{R}$ by

$$\langle f,g\rangle=\int_{-\infty}^{\infty}f(t)\overline{g(t)}\,dt.$$

The $L^2$ semi-norm is the norm induced by this inner product.
The following is a generalization of Parseval's identity to Fourier transforms.


Theorem 7.41 Parseval- Plancherel Identity 


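If $f:\mathbb{R}\to\mathbb{C}$ is a Riemann integrable function that is both $L^1$ and $L^2$, then its Fourier transform $\hat f:\mathbb{R}\to\mathbb{C}$ is an $L^2$ function. Moreover,

$$\|f\|_2^2=\frac{1}{2\pi}\|\hat f\|_2^2.$$

That is,

$$\int_{-\infty}^{\infty}|f(t)|^2\,dt=\frac{1}{2\pi}\int_{-\infty}^{\infty}|\hat f(\omega)|^2\,d\omega. \tag{7.10}$$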


Sketch of Proof 
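A rigorous proof of this theorem requires advanced tools in analysis. We give a heuristic argument for the validity of the formula (7.10) under the additional assumption that $f:\mathbb{R}\to\mathbb{C}$ is continuous and strongly piecewise differentiable, and $\hat f:\mathbb{R}\to\mathbb{C}$ is also $L^1$.

Since $f:\mathbb{R}\to\mathbb{C}$ is a continuous and strongly piecewise differentiable $L^1$ function, the Fourier inversion theorem implies that for all $t\in\mathbb{R}$,

$$f(t)=\mathcal{F}^{-1}[\hat f](t)=\frac{1}{2\pi}\int_{-\infty}^{\infty}\hat f(\omega)e^{i\omega t}\,d\omega.$$

Notice that $\hat f:\mathbb{R}\to\mathbb{C}$ is an $L^2$ function if the limit

$$\lim_{L\to\infty}\int_{-L}^{L}|\hat f(\omega)|^2\,d\omega$$

exists. By the definition of Fourier transform,

$$\int_{-L}^{L}|\hat f(\omega)|^2\,d\omega=\int_{-L}^{L}\overline{\hat f(\omega)}\int_{-\infty}^{\infty}f(t)e^{-i\omega t}\,dt\,d\omega.$$

By Theorem 7.34, $\hat f(\omega)$ is uniformly continuous. By the Riemann-Lebesgue lemma, $\lim_{\omega\to\pm\infty}\hat f(\omega)=0$. These imply that the function $\hat f:\mathbb{R}\to\mathbb{C}$ is bounded. Using the same reasoning as in the proof of Lemma 7.39, we can interchange the order of integration and obtain

$$\int_{-L}^{L}|\hat f(\omega)|^2\,d\omega=\int_{-\infty}^{\infty}f(t)\int_{-L}^{L}\overline{\hat f(\omega)}e^{-i\omega t}\,d\omega\,dt.$$

Since $\hat f:\mathbb{R}\to\mathbb{C}$ is $L^1$, we can take the $L\to\infty$ limit under the integral sign. Since

$$\lim_{L\to\infty}\int_{-L}^{L}\overline{\hat f(\omega)}e^{-i\omega t}\,d\omega=\overline{\int_{-\infty}^{\infty}\hat f(\omega)e^{i\omega t}\,d\omega}=2\pi\overline{f(t)},$$

we conclude that

$$\lim_{L\to\infty}\int_{-L}^{L}|\hat f(\omega)|^2\,d\omega=2\pi\int_{-\infty}^{\infty}f(t)\overline{f(t)}\,dt=2\pi\int_{-\infty}^{\infty}|f(t)|^2\,dt.$$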


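Example 7.31

For the function $f:\mathbb{R}\to\mathbb{C}$, $f(t)=e^{-a|t|}$ with $a>0$, its Fourier transform is $\hat f:\mathbb{R}\to\mathbb{C}$, $\hat f(\omega)=\dfrac{2a}{a^2+\omega^2}$. Notice that

$$\int_{-\infty}^{\infty}|f(t)|^2\,dt=2\int_0^{\infty}e^{-2at}\,dt=\frac{1}{a}.$$

The Parseval-Plancherel formula implies that

$$\int_{-\infty}^{\infty}\frac{d\omega}{(a^2+\omega^2)^2}=\frac{\pi}{2a^3}.$$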
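
The two sides of the Parseval-Plancherel identity for $f(t)=e^{-a|t|}$ can be compared numerically. The following is a sketch under the chapter's normalization, in which the frequency side carries the factor $\frac{1}{2\pi}$; the function names, cutoffs and grid sizes are assumptions of this sketch.

```python
import math

def simpson(func, lo, hi, n):
    """Composite Simpson rule on [lo, hi] with n subintervals (n is made even)."""
    n += n % 2
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3

def time_side(a, cutoff=40.0, n=20000):
    """int |f(t)|^2 dt for f(t) = exp(-a|t|), using evenness and a truncated range."""
    return 2.0 * simpson(lambda t: math.exp(-2.0 * a * t), 0.0, cutoff / a, n)

def frequency_side(a, W=400.0, n=80000):
    """(1 / 2 pi) * int |2a / (a^2 + w^2)|^2 dw, truncated to [-W, W]."""
    integrand = lambda w: (2.0 * a / (a * a + w * w)) ** 2
    return 2.0 * simpson(integrand, 0.0, W, n) / (2.0 * math.pi)

if __name__ == "__main__":
    a = 1.5
    print(time_side(a), frequency_side(a), 1.0 / a)
```

Both quantities should be close to $1/a$.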


One of the applications of Fourier transforms is to solve differential equations. For this we need the following.

Theorem 7.42

Let $f:\mathbb{R}\to\mathbb{C}$ be a continuously differentiable $L^1$ function such that

$$\lim_{t\to\pm\infty}f(t)=0,$$

and its derivative $f':\mathbb{R}\to\mathbb{C}$ is a Riemann integrable function that has a Fourier transform. Then

$$\mathcal{F}[f'](\omega)=i\omega\,\mathcal{F}[f](\omega).$$

This follows from integration by parts. For any $a$ and $b$ with $a<b$,

$$\int_a^b f'(t)e^{-i\omega t}\,dt=\Big[f(t)e^{-i\omega t}\Big]_a^b+i\omega\int_a^b f(t)e^{-i\omega t}\,dt.$$

The assertion follows by taking the limit $a\to-\infty$ and $b\to\infty$.
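
As an illustration of Theorem 7.42, the identity $\mathcal{F}[f'](\omega)=i\omega\,\mathcal{F}[f](\omega)$ can be tested on the Gaussian $f(t)=e^{-t^2}$, whose transform under the convention $\hat f(\omega)=\int f(t)e^{-i\omega t}\,dt$ is $\sqrt{\pi}\,e^{-\omega^2/4}$. The choice of test function, the truncation to $[-10,10]$ and all names below are assumptions of this sketch.

```python
import cmath
import math

def simpson(func, lo, hi, n):
    """Composite Simpson rule on [lo, hi]; works for complex-valued integrands."""
    n += n % 2
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3

def fourier_transform(func, omega, T=10.0, n=4000):
    """Approximate int func(t) exp(-i omega t) dt over [-T, T]."""
    return simpson(lambda t: func(t) * cmath.exp(-1j * omega * t), -T, T, n)

def gaussian(t):
    return math.exp(-t * t)

def gaussian_derivative(t):
    return -2.0 * t * math.exp(-t * t)

if __name__ == "__main__":
    for omega in (0.7, 2.0, 3.5):
        lhs = fourier_transform(gaussian_derivative, omega)
        rhs = 1j * omega * fourier_transform(gaussian, omega)
        print(omega, lhs, rhs)
```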


Example 7.32 


Find the Fourier transform of the function f : | 




By linearity, 


Therefore, 


Solution 


: R > C be the function 


In the following, we consider an operation called convolution. 


Definition 7.17 Convolution 


Let $f:\mathbb{R}\to\mathbb{C}$ and $g:\mathbb{R}\to\mathbb{C}$ be Riemann integrable functions. The convolution of $f$ and $g$ is the function defined as

$$(f*g)(x)=\int_{-\infty}^{\infty}f(x-t)g(t)\,dt=\int_{-\infty}^{\infty}f(t)g(x-t)\,dt$$

whenever this integral is convergent.

Notice that the improper integral defining $f*g$ is convergent for any $x$ in $\mathbb{R}$ when $f$ and $g$ are $L^2$ functions. Convolutions can be defined for a wider class of functions. For example, if the supports of the functions $f$ and $g$ are both contained in $[0,\infty)$, then the integrand is nonzero only when $0\le t\le x$. This gives

$$(f*g)(x)=\int_0^{x}f(t)g(x-t)\,dt,$$

which is also well-defined for any $x\in\mathbb{R}$. In fact, this is the convolution one sees in the theory of Laplace transforms.


Example 7.33 


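Let $f:\mathbb{R}\to\mathbb{R}$ be the function defined as

$$f(x)=\begin{cases}1,&\text{if } 0\le x\le 1,\\ 0,&\text{otherwise},\end{cases} \tag{7.11}$$

and let $g:\mathbb{R}\to\mathbb{R}$ be the function $g(x)=x$. Then

$$(f*g)(x)=\int_{-\infty}^{\infty}f(t)g(x-t)\,dt=\int_0^1(x-t)\,dt=x-\frac{1}{2}.$$

For $f*f$, we have

$$(f*f)(x)=\int_0^1 f(x-t)\,dt=\begin{cases}0,&\text{if } x<0,\\ x,&\text{if } 0\le x<1,\\ 2-x,&\text{if } 1\le x<2,\\ 0,&\text{if } x\ge 2.\end{cases}$$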


Figure 7.18: The function f : R — R defined by (7.11) and the function f * f. 
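
The two convolutions in Example 7.33 can be reproduced with a simple midpoint rule. This is a sketch only; `convolve_with_box` is a hypothetical helper that uses the fact that the box function vanishes outside $[0,1]$, so that $(f*g)(x)=\int_0^1 g(x-t)\,dt$.

```python
def box(x):
    """The function of (7.11): 1 on [0, 1] and 0 elsewhere."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def convolve_with_box(g, x, n=4000):
    """Midpoint-rule value of (box * g)(x) = int_0^1 g(x - t) dt."""
    h = 1.0 / n
    return h * sum(g(x - (k + 0.5) * h) for k in range(n))

def triangle(x):
    """Closed form of box * box: a tent supported on [0, 2]."""
    if 0.0 <= x < 1.0:
        return x
    if 1.0 <= x < 2.0:
        return 2.0 - x
    return 0.0

if __name__ == "__main__":
    print(convolve_with_box(lambda s: s, 2.0))  # compare with x - 1/2 at x = 2
    for x in (-0.5, 0.25, 1.0, 1.5, 2.5):
        print(x, convolve_with_box(box, x), triangle(x))
```

The discontinuous box is turned into a continuous tent, which is the smoothing effect visible in the figure.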


Convolution usually smooths out a function, as shown in Figure 7.18.
In the theory of Fourier transforms, convolution plays an important role because 
of the following. 




Theorem 7.43 


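Let $f:\mathbb{R}\to\mathbb{C}$ and $g:\mathbb{R}\to\mathbb{C}$ be functions that are both $L^1$ and $L^2$. Then $f*g:\mathbb{R}\to\mathbb{C}$ is an $L^1$ function and

$$\mathcal{F}[f*g]=\mathcal{F}[f]\,\mathcal{F}[g].$$


Sketch of Proof


Fubini's theorem implies that

$$\int_{-\infty}^{\infty}|(f*g)(x)|\,dx\le\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}|f(x-t)||g(t)|\,dt\,dx\le\int_{-\infty}^{\infty}|g(t)|\int_{-\infty}^{\infty}|f(x-t)|\,dx\,dt=\|f\|_1\int_{-\infty}^{\infty}|g(t)|\,dt=\|f\|_1\|g\|_1.$$

This shows that $f*g$ is an $L^1$ function. By Fubini's theorem again, we have

$$\mathcal{F}[f*g](\omega)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}f(x-t)g(t)\,dt\,e^{-i\omega x}\,dx=\int_{-\infty}^{\infty}g(t)\int_{-\infty}^{\infty}f(x-t)e^{-i\omega(x-t)}\,dx\,e^{-i\omega t}\,dt=\hat f(\omega)\int_{-\infty}^{\infty}g(t)e^{-i\omega t}\,dt=\mathcal{F}[f](\omega)\mathcal{F}[g](\omega).$$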


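Find the Fourier transform of the function $g:\mathbb{R}\to\mathbb{R}$ defined by

$$g(x)=\begin{cases}x,&\text{if } 0\le x<1,\\ 2-x,&\text{if } 1\le x<2,\\ 0,&\text{otherwise}.\end{cases}$$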


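Solution
By Example 7.33, $g=f*f$, where $f$ is the function given by (7.11). The Fourier transform of $f$ is

$$\hat f(\omega)=\int_0^1 e^{-i\omega t}\,dt=\frac{1-e^{-i\omega}}{i\omega}.$$

Therefore, the Fourier transform of $g$ is

$$\hat g(\omega)=\hat f(\omega)\times\hat f(\omega)=e^{-i\omega}\,\frac{4\sin^2\dfrac{\omega}{2}}{\omega^2}.$$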
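
The convolution theorem can be checked on this example by computing the transform of the tent function directly and comparing it with the square of the transform of the box. The sketch below assumes $\omega\neq 0$; the function names are hypothetical, and the grid is chosen so that the kink of the tent at $x=1$ falls on a panel boundary of the Simpson rule.

```python
import cmath
import math

def simpson(func, lo, hi, n):
    """Composite Simpson rule on [lo, hi]; works for complex-valued integrands."""
    n += n % 2
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3

def triangle(x):
    """The tent function box * box, supported on [0, 2]."""
    if 0.0 <= x < 1.0:
        return x
    if 1.0 <= x < 2.0:
        return 2.0 - x
    return 0.0

def box_hat(omega):
    """Transform of the box on [0, 1]: (1 - exp(-i w)) / (i w), for w != 0."""
    return (1.0 - cmath.exp(-1j * omega)) / (1j * omega)

def triangle_hat_numeric(omega, n=4000):
    """Direct quadrature of int_0^2 triangle(x) exp(-i w x) dx."""
    return simpson(lambda x: triangle(x) * cmath.exp(-1j * omega * x), 0.0, 2.0, n)

def triangle_hat_closed(omega):
    """exp(-i w) * 4 sin^2(w / 2) / w^2, for w != 0."""
    return cmath.exp(-1j * omega) * 4.0 * math.sin(omega / 2.0) ** 2 / omega ** 2

if __name__ == "__main__":
    for omega in (1.3, 4.0, -2.2):
        print(omega, triangle_hat_numeric(omega), box_hat(omega) ** 2)
```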


Now we list some other useful properties of Fourier transforms. The proofs are left as exercises.


Theorem 7.44 


Let $f:\mathbb{R}\to\mathbb{C}$ be an $L^1$ function and let $a$ be a real number.


(a) Ifg:] 


$$\frac{\partial^2 u}{\partial t^2}=c^2\frac{\partial^2 u}{\partial x^2}, \tag{7.12}$$

where $c$ is a positive constant. This is called the wave equation. The function $u$ is a function of $(t,x)\in\mathbb{R}^2$. For simplicity, we assume that $u$, $u_t$, $u_x$ are infinitely differentiable bounded $L^1$ functions which decay to 0 when $t\to\pm\infty$.

Let $\hat u(\omega,x)$ be the Fourier transform of $u$ with respect to the variable $t$. Then

$$\mathcal{F}[u_{tt}](\omega,x)=-\omega^2\hat u(\omega,x).$$


It can be justified that

$$\mathcal{F}[u_{xx}](\omega,x)=\frac{\partial^2\hat u}{\partial x^2}(\omega,x).$$

Thus, under Fourier transform with respect to $t$, the partial differential equation (7.12) is transformed to a second order ordinary differential equation

$$\frac{\partial^2\hat u}{\partial x^2}(\omega,x)+\frac{\omega^2}{c^2}\hat u(\omega,x)=0 \tag{7.13}$$

with respect to the variable $x$. The general solution is

$$\hat u(\omega,x)=A(\omega)e^{i\omega x/c}+B(\omega)e^{-i\omega x/c}$$


for some infinitely differentiable functions A(w) and B(w). Assume that 


respectively. These give 


Ute — A( 


Then

$$u(t,x)=\phi(x+ct)+\psi(x-ct).$$

This shows that the solution of the wave equation can be written as a sum of a left-travelling wave $\phi(x+ct)$ and a right-travelling wave $\psi(x-ct)$.




Exercises 7.5 


Question 1 


Let $f:\mathbb{R}\to\mathbb{C}$ be an $L^1$ function and let $a$ be a real number. Define the function $g:\mathbb{R}\to\mathbb{C}$ by

$$g(t)=f(t-a).$$

Show that

$$\hat g(\omega)=e^{-ia\omega}\hat f(\omega).$$


Question 2 


Let $f:\mathbb{R}\to\mathbb{C}$ be an $L^1$ function and let $a$ be a real number. Define the function $g:\mathbb{R}\to\mathbb{C}$ by

$$g(t)=e^{iat}f(t).$$

Show that

$$\hat g(\omega)=\hat f(\omega-a).$$


Question 3 


Find the Fourier transform of the function f : R — C. 
1 
(a) f(t) = Pad 
1 


(OO) reereere 


sint 
© {0 ay aE 


Question 4 


Let f : R > R be the function f(t) = e~*!, and let g : R > R be the 
function g(t) = (f * f)(t). Use the convolution theorem to find the Fourier


transform of the function g : R > R. 




Question 5 


Let a and b be two distinct positive numbers, and let f : R → R and g :
R = R be the functions f(t) = e~®” and g(t) = e~””. Find the function 
h:R-— R defined as h(t) = (f * g)(t). 


Question 6 


Let $f:\mathbb{R}\to\mathbb{C}$ be a bounded $L^1$ function. Show that $f$ is $L^2$.


Appendix A. Sylvester’s Criterion 615 


Appendix A 
Sylvester’s Criterion 


In this section, we give a proof of Sylvester's criterion, which gives a necessary and sufficient condition for a symmetric matrix to be positive definite. The proof uses the LDU factorization of a matrix.

Given an $n\times n$ matrix $A$ and an integer $1\le k\le n$, the $k$-th principal submatrix of $A$, denoted by $M_k(A)$, is the $k\times k$ matrix consisting of the first $k$ rows and first $k$ columns of $A$. Sylvester's criterion is the following.


Theorem A.1 Sylvester’s Criterion for Positive Definiteness 


An $n\times n$ symmetric matrix $A$ is positive definite if and only if $\det M_k>0$ for all $1\le k\le n$, where $M_k$ is its $k$-th principal submatrix.


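For a positive integer $n$, let $\mathcal{M}_n$ be the vector space of $n\times n$ matrices, and let $\mathcal{L}_n$, $\mathcal{U}_n$ and $\mathcal{D}_n$ be respectively the subspaces that consist of lower triangular, upper triangular, and diagonal matrices. Also, let

$$\widetilde{\mathcal{L}}_n=\{L\in\mathcal{L}_n\mid\text{all the diagonal entries of } L \text{ are equal to } 1\},$$
$$\widetilde{\mathcal{U}}_n=\{U\in\mathcal{U}_n\mid\text{all the diagonal entries of } U \text{ are equal to } 1\}.$$

Notice that $L$ is in $\widetilde{\mathcal{L}}_n$ if and only if its transpose $L^T$ is in $\widetilde{\mathcal{U}}_n$.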


The set of $n\times n$ invertible matrices is a group under matrix multiplication. This group is denoted by $GL(n,\mathbb{R})$, and is called the general linear group. As a set, it is the subset of $\mathcal{M}_n$ that consists of all the matrices $A$ with $\det A\neq 0$. The group $GL(n,\mathbb{R})$ has a subgroup that contains all the invertible matrices with determinant 1, denoted by $SL(n,\mathbb{R})$, and is called the special linear group. The sets $\widetilde{\mathcal{L}}_n$ and $\widetilde{\mathcal{U}}_n$ are subgroups of $SL(n,\mathbb{R})$.


If $A$ is an $n\times n$ matrix, an LDU factorization of $A$ is a factorization of the form

$$A=LDU,$$

where $L\in\widetilde{\mathcal{L}}_n$, $D\in\mathcal{D}_n$, and $U\in\widetilde{\mathcal{U}}_n$. Notice that $\det A=\det D$. Hence, $A$ is invertible if and only if all the diagonal entries of $D$ are nonzero.

The following proposition says that the LDU decomposition of an invertible matrix is unique.


Proposition A.2 Uniqueness of LDU Factorization 


If $A$ is an $n\times n$ invertible matrix that has an LDU factorization, then the factorization is unique.


We need to prove that if $L_1$, $L_2$ are in $\widetilde{\mathcal{L}}_n$, $U_1$, $U_2$ are in $\widetilde{\mathcal{U}}_n$, $D_1$, $D_2$ are in $\mathcal{D}_n$, and

$$L_1D_1U_1=L_2D_2U_2,$$

then $L_1=L_2$, $U_1=U_2$ and $D_1=D_2$.
Let $L=L_2^{-1}L_1$ and $U=U_2U_1^{-1}$. Then

$$LD_1=D_2U.$$

Notice that $L$ is in $\widetilde{\mathcal{L}}_n$ and $LD_1$ is in $\mathcal{L}_n$. Similarly, $U$ is in $\widetilde{\mathcal{U}}_n$, and $D_2U$ is in $\mathcal{U}_n$. The intersection of $\mathcal{L}_n$ and $\mathcal{U}_n$ is $\mathcal{D}_n$. Thus, there exists $D\in\mathcal{D}_n$ such that

$$LD_1=D=D_2U.$$

Since $A$ is invertible, $D_1$ and $D_2$ are invertible. Hence,

$$L=DD_1^{-1}\quad\text{and}\quad U=D_2^{-1}D$$

are diagonal matrices. Since all the diagonal entries of $L$ and $U$ are 1, we find that $DD_1^{-1}=I_n$ and $D_2^{-1}D=I_n$, where $I_n$ is the $n\times n$ identity matrix. This proves that

$$D_1=D=D_2.$$

But then $L=I_n=U$, which implies that $L_1=L_2$ and $U_1=U_2$.


Corollary A.3


(i) Given $L_0\in\mathcal{L}_n$, if $L_0$ is invertible, it has a unique LDU decomposition with $U=I_n$, the $n\times n$ identity matrix.

(ii) Given $U_0\in\mathcal{U}_n$, if $U_0$ is invertible, it has a unique LDU decomposition with $L=I_n$, the $n\times n$ identity matrix.


It suffices to establish (i). The uniqueness is asserted in Proposition A.2.

For the existence, let $L_0=[a_{ij}]$, where $a_{ij}=0$ if $i<j$. Since $L_0$ is invertible, $a_{ii}\neq 0$ for all $1\le i\le n$. Let $D=[d_{ij}]$ be the diagonal matrix with $d_{ii}=a_{ii}$ for $1\le i\le n$. Then $D$ is invertible. Define $L=L_0D^{-1}$. Then $L$ is a lower triangular matrix and for $1\le i\le n$,

$$L_{ii}=\frac{a_{ii}}{d_{ii}}=1.$$

This shows that $L$ is in $\widetilde{\mathcal{L}}_n$. Thus, $L_0=LD$ is the LDU decomposition of $L_0$ with $U=I_n$.


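The following lemma says that multiplying by a matrix $L$ in $\widetilde{\mathcal{L}}_n$ does not affect the determinants of the principal submatrices.


Lemma A.4


Let $A$ be an $n\times n$ matrix, and let $L$ be a matrix in $\widetilde{\mathcal{L}}_n$. If $B=LA$, then for $1\le k\le n$,

$$\det M_k(B)=\det M_k(A).$$


For $1\le k\le n$, write an $n\times n$ matrix $C$ in block form as

$$C=\begin{bmatrix}M_k(C)&N_k(C)\\ P_k(C)&Q_k(C)\end{bmatrix},$$

where $M_k(C)$ is the $k$-th principal submatrix of $C$.
For $L\in\widetilde{\mathcal{L}}_n$, $M_k(L)$ is in $\widetilde{\mathcal{L}}_k$ and $N_k(L)$ is the zero matrix. Now $B=LA$ implies that

$$\begin{bmatrix}M_k(B)&N_k(B)\\ P_k(B)&Q_k(B)\end{bmatrix}=\begin{bmatrix}M_k(L)&N_k(L)\\ P_k(L)&Q_k(L)\end{bmatrix}\begin{bmatrix}M_k(A)&N_k(A)\\ P_k(A)&Q_k(A)\end{bmatrix}.$$

This implies that

$$M_k(B)=M_k(L)M_k(A)+N_k(L)P_k(A)=M_k(L)M_k(A).$$

Since $M_k(L)\in\widetilde{\mathcal{L}}_k$, $\det M_k(L)=1$. Therefore,

$$\det M_k(B)=\det M_k(A).$$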


Lemma A.4 has an upper triangular counterpart. 
Corollary A.5 


Let $A$ be an $n\times n$ matrix, and let $U$ be a matrix in $\widetilde{\mathcal{U}}_n$. If $B=AU$, then for $1\le k\le n$,

$$\det M_k(B)=\det M_k(A).$$


Sketch of Proof
Notice that $M_k(B^T)=M_k(B)^T$, and $B^T=U^TA^T$, where $U^T$ is in $\widetilde{\mathcal{L}}_n$. The result follows from the fact that $\det C^T=\det C$ for any $k\times k$ matrix $C$.


Now we prove the following theorem which asserts the existence of LDU decomposition for a matrix $A$ with $\det M_k(A)\neq 0$ for all $1\le k\le n$.


Theorem A.6
Let $A=[a_{ij}]$ be an $n\times n$ matrix such that $\det M_k(A)\neq 0$ for all $1\le k\le n$. Then $A$ has a unique LDU decomposition.


Notice that $M_n(A)=A$. Since we assume that $\det M_n(A)\neq 0$, $A$ is invertible. The uniqueness of the LDU decomposition of $A$ is asserted in Proposition A.2.

We prove the statement by induction on $n$. When $n=1$, take $L=U=[1]$ and $D=A=[a]$ itself. Then $A=LDU$ is the LDU decomposition of $A$.
Let $n\ge 2$. Suppose we have proved that any $(n-1)\times(n-1)$ matrix $B$ that satisfies $\det M_k(B)\neq 0$ for $1\le k\le n-1$ has a unique LDU decomposition.

Now assume that $A$ is an $n\times n$ matrix with $\det M_k(A)\neq 0$ for all $1\le k\le n$. Since $\det M_1(A)=a_{11}$, $a=a_{11}\neq 0$. Let $L_1=[L_{ij}]$ be the matrix in $\widetilde{\mathcal{L}}_n$ such that for $2\le i\le n$,

$$L_{i1}=\frac{a_{i1}}{a},$$

and for $2\le j<i\le n$, $L_{ij}=0$. Namely,

$$L_1=\begin{bmatrix}1&0\\ P_1(L_1)&I_{n-1}\end{bmatrix},$$

where

$$P_1(L_1)=\frac{1}{a}\begin{bmatrix}a_{21}\\ \vdots\\ a_{n1}\end{bmatrix}.$$

Notice that

$$L_1^{-1}=\begin{bmatrix}1&0\\ -P_1(L_1)&I_{n-1}\end{bmatrix},$$

and

$$C=L_1^{-1}A$$

is a matrix with

$$M_1(C)=[a],\qquad P_1(C)=0.$$

By Lemma A.4,

$$\det M_k(C)=\det M_k(A)\quad\text{for all } 1\le k\le n.$$

Let $B=Q_1(C)$. Then $B$ is an $(n-1)\times(n-1)$ matrix. Since $P_1(C)=0$, we find that for $1\le k\le n-1$,

$$\det M_{k+1}(C)=a\det M_k(B).$$

This shows that

$$\det M_k(B)\neq 0\quad\text{for all } 1\le k\le n-1.$$

By inductive hypothesis, $B$ has a unique LDU decomposition given by

$$B=L_BD_BU_B.$$

Now let $L_2$ be the matrix in $\widetilde{\mathcal{L}}_n$ given by

$$L_2=\begin{bmatrix}1&0\\ 0&L_B\end{bmatrix}.$$

One can check that

$$L_2^{-1}C=\begin{bmatrix}a&N_1(C)\\ 0&D_BU_B\end{bmatrix}.$$

Let $L=L_1L_2$. Then $L$ is in $\widetilde{\mathcal{L}}_n$. Since $D_BU_B$ is an upper triangular $(n-1)\times(n-1)$ matrix, $L^{-1}A=L_2^{-1}C$ is an upper triangular $n\times n$ matrix. By Corollary A.3, $L^{-1}A$ has a decomposition

$$L^{-1}A=DU,$$

where $D\in\mathcal{D}_n$ and $U\in\widetilde{\mathcal{U}}_n$. Thus, $A=LDU$ is the LDU decomposition of $A$.
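
The existence argument amounts to Gaussian elimination without row exchanges. The following is a minimal sketch of that idea, not the inductive construction of the proof itself; the function names `ldu` and `matmul` are hypothetical, and exact rational arithmetic is used so that the factorization can be verified exactly.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def ldu(A):
    """Return (L, D, U) with A = L * diag(D) * U, L unit lower and U unit upper triangular.

    Elimination proceeds without pivoting, so it succeeds exactly when every
    leading principal minor of A is nonzero (the hypothesis of Theorem A.6).
    """
    n = len(A)
    work = [[Fraction(x) for x in row] for row in A]  # becomes D * U
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for j in range(n):
        pivot = work[j][j]
        if pivot == 0:
            raise ValueError("a leading principal minor vanishes")
        for i in range(j + 1, n):
            m = work[i][j] / pivot
            L[i][j] = m  # multiplier stored in the unit lower triangular factor
            for c in range(j, n):
                work[i][c] -= m * work[j][c]
    D = [work[i][i] for i in range(n)]
    U = [[work[i][c] / D[i] for c in range(n)] for i in range(n)]
    return L, D, U

if __name__ == "__main__":
    A = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]
    L, D, U = ldu(A)
    print(L, D, U, sep="\n")
```

Multiplying the three factors back together recovers $A$, and the diagonal entries of $D$ are the successive ratios of leading principal minors.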


Now we can complete the proof of the Sylvester’s criterion for a symmetric 
matrix to be positive definite. 




Proof of Sylvester’s Criterion 
Let $A$ be an $n\times n$ symmetric matrix. First we prove that if $A$ is positive definite, then for $1\le k\le n$, $\det M_k(A)>0$. Notice that $M_k(A)$ is also a symmetric matrix. For $\mathbf{u}\in\mathbb{R}^k$, let $\mathbf{v}$ be the vector in $\mathbb{R}^n$ given by $\mathbf{v}=(\mathbf{u},\mathbf{0})$. Then

$$\mathbf{v}^TA\mathbf{v}=\mathbf{u}^TM_k(A)\mathbf{u}.$$

This shows that $M_k(A)$ is also positive definite. Hence, all the eigenvalues of $M_k(A)$ must be positive. This implies that $\det M_k(A)>0$.

Conversely, assume that $\det M_k(A)>0$ for all $1\le k\le n$. By Theorem A.6, $A$ has an LDU decomposition given by

$$A=LDU.$$

Since $A$ is symmetric, $A^T=A$. This gives

$$U^TD^TL^T=A^T=A=LDU.$$

Since $U^T$ is in $\widetilde{\mathcal{L}}_n$ and $L^T$ is in $\widetilde{\mathcal{U}}_n$, the uniqueness of LDU decomposition implies that $U=L^T$. Hence,

$$A=LDL^T.$$

By Lemma A.4 and Corollary A.5,

$$\det M_k(A)=\det M_k(D).$$

If $D=[d_{ij}]$, let $\tau_i=d_{ii}$. Then $\det M_k(D)=\tau_1\tau_2\cdots\tau_k$. Since $\det M_k(A)>0$ for all $1\le k\le n$, $\tau_i>0$ for all $1\le i\le n$. By the invertible change of coordinates $\mathbf{y}=L^T\mathbf{x}$, we find that if $\mathbf{x}\in\mathbb{R}^n\setminus\{\mathbf{0}\}$,

$$\mathbf{x}^TA\mathbf{x}=\mathbf{y}^TD\mathbf{y}=\tau_1y_1^2+\tau_2y_2^2+\cdots+\tau_ny_n^2>0.$$

This proves that $A$ is positive definite.
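
Sylvester's criterion gives a finite test for positive definiteness. The sketch below computes the leading principal minors in exact arithmetic; it assumes the input matrix is symmetric and does not check this, and the names `det`, `leading_principal_minors` and `is_positive_definite` are choices of the sketch.

```python
from fractions import Fraction

def det(M):
    """Determinant by Gaussian elimination with row swaps, in exact arithmetic."""
    A = [[Fraction(x) for x in row] for row in M]
    n = len(A)
    sign = 1
    result = Fraction(1)
    for j in range(n):
        pivot_row = next((i for i in range(j, n) if A[i][j] != 0), None)
        if pivot_row is None:
            return Fraction(0)  # a zero column below the diagonal: singular matrix
        if pivot_row != j:
            A[j], A[pivot_row] = A[pivot_row], A[j]
            sign = -sign  # each row swap flips the sign of the determinant
        result *= A[j][j]
        for i in range(j + 1, n):
            m = A[i][j] / A[j][j]
            for c in range(j, n):
                A[i][c] -= m * A[j][c]
    return sign * result

def leading_principal_minors(A):
    """det M_k(A) for k = 1, ..., n."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]

def is_positive_definite(A):
    """Sylvester's criterion; A is assumed symmetric (not verified here)."""
    return all(m > 0 for m in leading_principal_minors(A))

if __name__ == "__main__":
    print(leading_principal_minors([[2, -1, 0], [-1, 2, -1], [0, -1, 2]]))
    print(is_positive_definite([[1, 2], [2, 1]]))
```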


Appendix B. Volumes of Parallelepipeds 622 


Appendix B 
Volumes of Parallelepipeds 


In this appendix, we give a geometric proof of the formula for the volume of a 


parallelepiped in R”. 


Theorem B.1 


Let $\mathcal{P}$ be a parallelepiped in $\mathbb{R}^n$ spanned by the linearly independent vectors $\mathbf{v}_1,\ldots,\mathbf{v}_n$. Then the volume of $\mathcal{P}$ is equal to $|\det A|$, where $A$ is the matrix whose column vectors are $\mathbf{v}_1,\ldots,\mathbf{v}_n$.


Let us look at a special case of parallelepiped where this theorem is easy to 


prove by simple geometric consideration. 


Definition B.1 Generalized Rectangles 


A parallelepiped that is spanned by $n$ nonzero orthogonal vectors $\mathbf{w}_1,\ldots,\mathbf{w}_n$ is called a generalized rectangle.


A generalized rectangle $\mathcal{P}$ based at the origin and spanned by the $n$ nonzero orthogonal vectors $\mathbf{w}_1,\ldots,\mathbf{w}_n$ is equal to $B(Q_n)$, where $Q_n=[0,1]^n$ is the standard unit cube, and $B$ is the matrix

$$B=\begin{bmatrix}\mathbf{w}_1&|&\cdots&|&\mathbf{w}_n\end{bmatrix}.$$

By geometric consideration, the volume of $\mathcal{P}$ is given by the product of the lengths of its edges. Namely,

$$\text{vol}\,(\mathcal{P})=\|\mathbf{w}_1\|\cdots\|\mathbf{w}_n\|.$$

To see that this is equal to $|\det B|$, let $\mathbf{u}_1,\ldots,\mathbf{u}_n$ be the unit vectors in the directions of $\mathbf{w}_1,\ldots,\mathbf{w}_n$. Namely,

$$\mathbf{u}_i=\frac{\mathbf{w}_i}{\|\mathbf{w}_i\|},\quad 1\le i\le n.$$

Then $B=PD$, where $P$ is an orthogonal matrix and $D$ is a diagonal matrix given respectively by

$$P=\begin{bmatrix}\mathbf{u}_1&|&\cdots&|&\mathbf{u}_n\end{bmatrix},\qquad D=\begin{bmatrix}\|\mathbf{w}_1\|&0&\cdots&0\\ 0&\|\mathbf{w}_2\|&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&\|\mathbf{w}_n\|\end{bmatrix}. \tag{B.1}$$

An $n\times n$ matrix $P$ is called an orthogonal matrix if

$$P^TP=PP^T=I_n,$$

where $I_n$ is the $n\times n$ identity matrix. A matrix $P$ is orthogonal if and only if the column vectors of $P$ form an orthonormal basis of $\mathbb{R}^n$. If $P$ is orthogonal, $P^{-1}=P^T$, and $P^{-1}$ is also orthogonal. From $P^TP=I_n$, we find that

$$\det(P)\det(P^T)=\det(I_n)=1.$$

Since $\det(P^T)=\det(P)$, we have $\det(P)^2=1$. Hence, the determinant of an orthogonal matrix can only be 1 or $-1$. Therefore, when $B=PD$, with $P$ and $D$ as given in (B.1), we have

$$|\det B|=|\det P\det D|=|\det D|=\|\mathbf{w}_1\|\cdots\|\mathbf{w}_n\|.$$

Remark B.1
Remark B.1 


In the argument above, we do not show that the volume of a generalized rectangle $\mathcal{R}$ spanned by the $n$ nonzero orthogonal vectors $\mathbf{w}_1,\ldots,\mathbf{w}_n$ is equal to $\|\mathbf{w}_1\|\cdots\|\mathbf{w}_n\|$ using the definition of $\text{vol}\,(\mathcal{R})$ in terms of a Riemann integral $\displaystyle\int_{\mathcal{R}}d\mathbf{x}$. This is elementary but tedious.


A linear transformation $T:\mathbb{R}^n\to\mathbb{R}^n$, $T(\mathbf{x})=P\mathbf{x}$ defined by an orthogonal matrix $P$ is called an orthogonal transformation. The significance of an orthogonal transformation is as follows. For any $\mathbf{u}$ and $\mathbf{v}$ in $\mathbb{R}^n$,

$$\langle T(\mathbf{u}),T(\mathbf{v})\rangle=(P\mathbf{u})^T(P\mathbf{v})=\mathbf{u}^TP^TP\mathbf{v}=\mathbf{u}^T\mathbf{v}=\langle\mathbf{u},\mathbf{v}\rangle.$$

Namely, $T$ preserves inner products. Since lengths and angles are defined in terms of the inner product, this implies that an orthogonal transformation preserves lengths and angles.


Under an orthogonal transformation, the image of a rectangle $\mathcal{R}$ is a rectangle that is congruent to $\mathcal{R}$. Since the volume of a Jordan measurable set $\mathfrak{D}$ is obtained by taking the limit of a sequence of Darboux lower sums, and each Darboux lower sum is a sum of volumes of rectangles with disjoint interiors that lie in $\mathfrak{D}$, we find that orthogonal transformations also preserve the volumes of Jordan measurable sets.


Theorem B.2 


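If $T:\mathbb{R}^n\to\mathbb{R}^n$, $T(\mathbf{x})=P\mathbf{x}$ is an orthogonal transformation, and $\mathfrak{D}$ is a Jordan measurable set, then $T(\mathfrak{D})$ is also Jordan measurable and

$$\text{vol}\,(T(\mathfrak{D}))=\text{vol}\,(\mathfrak{D}).$$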


To finish the proof of Theorem B.1, we also need the following fact. 


Proposition B.3 
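Let $\mathcal{P}$ be a parallelepiped based at the origin and spanned by the vectors $\mathbf{v}_1,\ldots,\mathbf{v}_n$. Assume that

$$\langle\mathbf{v}_i,\mathbf{e}_n\rangle=0\quad\text{for } 1\le i\le n-1,$$

or equivalently, $\mathbf{v}_1,\ldots,\mathbf{v}_{n-1}$ lie in the plane $x_n=0$. For $1\le i\le n-1$, let $\mathbf{z}_i\in\mathbb{R}^{n-1}$ be such that $\mathbf{v}_i=(\mathbf{z}_i,0)$. If $\mathcal{Q}$ is the parallelepiped in $\mathbb{R}^{n-1}$ based at the origin and spanned by $\mathbf{z}_1,\ldots,\mathbf{z}_{n-1}$, then

$$\text{vol}\,(\mathcal{P})=\text{vol}\,(\mathcal{Q})\,h,$$

where $h$ is the distance from $\mathbf{v}_n$ to the $x_n=0$ plane, which is given explicitly by

$$h=\|\text{proj}_{\mathbf{e}_n}\mathbf{v}_n\|.$$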


When n = 3, this can be argued geometrically. For general n, let us give a 


proof using the definition of volume as a Riemann integral. 


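Recall that

$$\mathcal{P}=\left\{t_1\mathbf{v}_1+\cdots+t_{n-1}\mathbf{v}_{n-1}+t_n\mathbf{v}_n\mid\mathbf{t}\in[0,1]^n\right\}.$$

Notice that $\mathbf{v}_n$ can be written as $\mathbf{v}_n=(\mathbf{a},h)$ for some $\mathbf{a}\in\mathbb{R}^{n-1}$. Hence, if a point $\mathbf{x}$ is in $\mathcal{P}$, then

$$\mathbf{x}=\left(\mathbf{z}+\frac{t}{h}\mathbf{a},\,t\right),$$

where $0\le t\le h$, and $\mathbf{z}$ is a point in $\mathcal{Q}$. For $0\le t\le h$, let

$$\mathcal{Q}_t=\left\{\left(\mathbf{z}+\frac{t}{h}\mathbf{a},\,t\right)\,\Big|\,\mathbf{z}\in\mathcal{Q}\right\}.$$

Then $\mathcal{Q}_t$ is an $(n-1)$-dimensional parallelepiped contained in the hyperplane $x_n=t$, which is a translate of the $(n-1)$-dimensional parallelepiped $\mathcal{Q}_0$.

By Fubini's theorem,

$$\text{vol}\,(\mathcal{P})=\int_{\mathcal{P}}d\mathbf{x}=\int_0^h\left(\int_{\mathcal{Q}_t}dx_1\cdots dx_{n-1}\right)dx_n=\int_0^h\text{vol}\,(\mathcal{Q}_t)\,dt=\int_0^h\text{vol}\,(\mathcal{Q}_0)\,dt=\text{vol}\,(\mathcal{Q})\,h.$$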


Now we can prove Theorem B.1. 


Proof of Theorem B.1 
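We prove by induction on $n$. The $n=1$ case is obvious. Assume that we have proved the $n-1$ case. Now given that $\mathcal{P}$ is a parallelepiped in $\mathbb{R}^n$ which is spanned by $\mathbf{v}_1,\ldots,\mathbf{v}_n$, we can assume that $\mathcal{P}$ is based at the origin $O$ because translations preserve volumes. Let

$$A=\begin{bmatrix}\mathbf{v}_1&|&\cdots&|&\mathbf{v}_n\end{bmatrix}.$$

We want to show that

$$\text{vol}\,(\mathcal{P})=|\det A|.$$

Let $W$ be the subspace of $\mathbb{R}^n$ that is spanned by $\mathbf{v}_1,\ldots,\mathbf{v}_{n-1}$. Applying the Gram-Schmidt process to the basis $\{\mathbf{v}_1,\ldots,\mathbf{v}_n\}$ of $\mathbb{R}^n$, we obtain an orthonormal basis $\{\mathbf{u}_1,\ldots,\mathbf{u}_n\}$. By the algorithm, the unit vector $\mathbf{u}_n$ is orthogonal to the subspace $W$. Let

$$P=\begin{bmatrix}\mathbf{u}_1&|&\cdots&|&\mathbf{u}_n\end{bmatrix}$$

be the orthogonal matrix whose column vectors are $\mathbf{u}_1,\ldots,\mathbf{u}_n$, and consider the orthogonal transformation $T:\mathbb{R}^n\to\mathbb{R}^n$, $T(\mathbf{x})=P^{-1}\mathbf{x}=P^T\mathbf{x}$. For $1\le i\le n$, let

$$\tilde{\mathbf{v}}_i=T(\mathbf{v}_i)=P^T\mathbf{v}_i.$$

Then $\tilde{\mathcal{P}}=T(\mathcal{P})$ is a parallelepiped that has the same volume as $\mathcal{P}$, and it is spanned by $\tilde{\mathbf{v}}_1,\ldots,\tilde{\mathbf{v}}_n$. Notice that

$$\tilde A=\begin{bmatrix}\tilde{\mathbf{v}}_1&|&\cdots&|&\tilde{\mathbf{v}}_n\end{bmatrix}=P^T\begin{bmatrix}\mathbf{v}_1&|&\cdots&|&\mathbf{v}_n\end{bmatrix}=\begin{bmatrix}\mathbf{u}_1^T\\ \vdots\\ \mathbf{u}_n^T\end{bmatrix}\begin{bmatrix}\mathbf{v}_1&|&\cdots&|&\mathbf{v}_n\end{bmatrix}=\begin{bmatrix}B&\begin{matrix}\langle\mathbf{u}_1,\mathbf{v}_n\rangle\\ \vdots\\ \langle\mathbf{u}_{n-1},\mathbf{v}_n\rangle\end{matrix}\\ 0&\langle\mathbf{u}_n,\mathbf{v}_n\rangle\end{bmatrix},$$

where $B$ is the $(n-1)\times(n-1)$ matrix with entries $\langle\mathbf{u}_i,\mathbf{v}_j\rangle$, $1\le i,j\le n-1$. From this, we find that

$$\det(\tilde A)=\det(B)\times\langle\mathbf{u}_n,\mathbf{v}_n\rangle.$$

Comparing the columns, we also have

$$\tilde{\mathbf{v}}_i=(\mathbf{z}_i,0)\quad\text{for } 1\le i\le n-1,$$

where $\mathbf{z}_1,\ldots,\mathbf{z}_{n-1}$ are the column vectors of $B$, which are vectors in $\mathbb{R}^{n-1}$, and

$$\tilde{\mathbf{v}}_n=\begin{bmatrix}\langle\mathbf{u}_1,\mathbf{v}_n\rangle\\ \vdots\\ \langle\mathbf{u}_{n-1},\mathbf{v}_n\rangle\\ \langle\mathbf{u}_n,\mathbf{v}_n\rangle\end{bmatrix}.$$

The transformation $T$ maps the subspace $W$ to the hyperplane $x_n=0$, which can be identified with $\mathbb{R}^{n-1}$. Let $\mathcal{Q}$ be the parallelepiped in $\mathbb{R}^{n-1}$ based at the origin and spanned by the vectors $\mathbf{z}_1,\ldots,\mathbf{z}_{n-1}$. The volume of the parallelepiped $\tilde{\mathcal{P}}$ is equal to the volume of $\mathcal{Q}$ times the distance $h$ from the tip of the vector $\tilde{\mathbf{v}}_n$ to the plane $x_n=0$. By definition,

$$h=\|\text{proj}_{\mathbf{e}_n}\tilde{\mathbf{v}}_n\|=|\langle\mathbf{u}_n,\mathbf{v}_n\rangle|.$$

Proposition B.3 gives

$$\text{vol}\,(\tilde{\mathcal{P}})=\text{vol}\,(\mathcal{Q})\times|\langle\mathbf{u}_n,\mathbf{v}_n\rangle|.$$

By inductive hypothesis,

$$\text{vol}\,(\mathcal{Q})=|\det(B)|.$$

Therefore,

$$\text{vol}\,(\tilde{\mathcal{P}})=|\det(B)\times\langle\mathbf{u}_n,\mathbf{v}_n\rangle|=|\det(\tilde A)|.$$

Since $\tilde A=P^TA$, we find that

$$\det(\tilde A)=\det(P^T)\det(A)=\pm\det(A).$$

Hence,

$$\text{vol}\,(\mathcal{P})=\text{vol}\,(\tilde{\mathcal{P}})=|\det(A)|.$$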


This completes the proof of Theorem B.1. 
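
The inductive step says that each new spanning vector multiplies the volume by its distance to the span of the previous ones. The sketch below follows that idea with a Gram-Schmidt loop and compares the result with $|\det A|$; the function names are hypothetical and floating-point arithmetic is used, so the agreement is only up to rounding.

```python
import math

def det(M):
    """Determinant by Gaussian elimination with partial pivoting (floating point)."""
    A = [[float(x) for x in row] for row in M]
    n = len(A)
    result = 1.0
    for j in range(n):
        p = max(range(j, n), key=lambda i: abs(A[i][j]))
        if A[p][j] == 0.0:
            return 0.0
        if p != j:
            A[j], A[p] = A[p], A[j]
            result = -result
        result *= A[j][j]
        for i in range(j + 1, n):
            m = A[i][j] / A[j][j]
            for c in range(j, n):
                A[i][c] -= m * A[j][c]
    return result

def volume_by_heights(vectors):
    """Product of the Gram-Schmidt residual lengths of the spanning vectors."""
    basis = []  # orthonormal vectors found so far
    volume = 1.0
    for v in vectors:
        residual = [float(x) for x in v]
        for u in basis:
            coeff = sum(ui * ri for ui, ri in zip(u, residual))
            residual = [ri - coeff * ui for ri, ui in zip(residual, u)]
        height = math.sqrt(sum(r * r for r in residual))
        if height == 0.0:
            return 0.0  # linearly dependent vectors span a degenerate parallelepiped
        volume *= height
        basis.append([r / height for r in residual])
    return volume

def columns_to_matrix(vectors):
    """Matrix whose column vectors are the given vectors."""
    n = len(vectors)
    return [[vectors[j][i] for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    vs = [(2, 1, 0), (0, 1, 1), (1, 0, 3)]
    print(volume_by_heights(vs), abs(det(columns_to_matrix(vs))))
```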


As a corollary, we have the following.


Theorem B.4 


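Let $\mathbf{I}$ be a closed rectangle in $\mathbb{R}^n$, and let $T:\mathbb{R}^n\to\mathbb{R}^n$, $T(\mathbf{x})=A\mathbf{x}$ be an invertible linear transformation. Then

$$\text{vol}\,(T(\mathbf{I}))=|\det A|\,\text{vol}\,(\mathbf{I}).$$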


Let $\mathbf{I}=\displaystyle\prod_{i=1}^n[a_i,b_i]$. Then $\mathbf{I}=S(Q_n)+\mathbf{a}$, where $\mathbf{a}=(a_1,\ldots,a_n)$, $Q_n$ is the standard unit cube $[0,1]^n$, and $S:\mathbb{R}^n\to\mathbb{R}^n$ is the linear transformation defined by the diagonal matrix $B$ with diagonal entries $b_1-a_1,b_2-a_2,\ldots,b_n-a_n$. Therefore,

$$T(\mathbf{I})=(T\circ S)(Q_n)+T(\mathbf{a}).$$

Since the matrix associated with the linear transformation $(T\circ S):\mathbb{R}^n\to\mathbb{R}^n$ is $AB$, $T(\mathbf{I})$ is a parallelepiped based at $T(\mathbf{a})$ and spanned by the column vectors of $AB$. By Theorem B.1,

$$\text{vol}\,(T(\mathbf{I}))=|\det(AB)|=|\det(A)||\det(B)|.$$

Obviously,

$$|\det B|=\prod_{i=1}^n(b_i-a_i)=\text{vol}\,(\mathbf{I}).$$

This proves that

$$\text{vol}\,(T(\mathbf{I}))=|\det A|\,\text{vol}\,(\mathbf{I}).$$


Remark B.2 
The formula 


$$\text{vol}\,(T(\mathbf{I}))=|\det A|\,\text{vol}\,(\mathbf{I})$$

still holds even if the matrix $A$ is not invertible. In this case, $\det A=0$, and the column vectors of $A$ are not linearly independent. Therefore, $T(\mathbf{I})$ lies in a hyperplane in $\mathbb{R}^n$, and so $T(\mathbf{I})$ has zero volume.


Appendix C. Riemann Integrability 


629 


Appendix C 


Necessary and Sufficient Condition for Riemann 


Integrability 


In this appendix, we want to prove the Lebesgue-Vitali theorem which gives a necessary and sufficient condition for a bounded function $f:\mathfrak{D}\to\mathbb{R}$ to be Riemann integrable. We will introduce the concept of Lebesgue measure zero without introducing the concept of general Lebesgue measure. The latter is often covered in a standard course in real analysis.

Recall that the volume of a closed rectangle $\mathbf{I}=\displaystyle\prod_{i=1}^n[a_i,b_i]$ or its interior $\text{int}\,\mathbf{I}=\displaystyle\prod_{i=1}^n(a_i,b_i)$ is

$$\text{vol}\,(\mathbf{I})=\text{vol}\,(\text{int}\,\mathbf{I})=\prod_{i=1}^n(b_i-a_i).$$


If $A$ is a subset of $\mathbb{R}^n$, we say that $A$ has Jordan content zero if

(i) for every $\varepsilon>0$, there are finitely many closed rectangles $\mathbf{I}_1,\ldots,\mathbf{I}_k$ such that

$$A\subset\bigcup_{j=1}^k\mathbf{I}_j\quad\text{and}\quad\sum_{j=1}^k\text{vol}\,(\mathbf{I}_j)<\varepsilon.$$

This is equivalent to any of the following.

(ii) For every $\varepsilon>0$, there are finitely many closed cubes $Q_1,\ldots,Q_k$ such that

$$A\subset\bigcup_{j=1}^kQ_j\quad\text{and}\quad\sum_{j=1}^k\text{vol}\,(Q_j)<\varepsilon.$$

(iii) For every $\varepsilon>0$, there are finitely many open rectangles $U_1,\ldots,U_k$ such that

$$A\subset\bigcup_{j=1}^kU_j\quad\text{and}\quad\sum_{j=1}^k\text{vol}\,(U_j)<\varepsilon.$$

(iv) For every $\varepsilon>0$, there are finitely many open cubes $V_1,\ldots,V_k$ such that

$$A\subset\bigcup_{j=1}^kV_j\quad\text{and}\quad\sum_{j=1}^k\text{vol}\,(V_j)<\varepsilon.$$


A set has Jordan content zero if and only if it is Jordan measurable and its volume is zero. Hence, we also say that a set that has Jordan content zero has Jordan measure zero. The Jordan measure of a Jordan measurable set $A$ is the volume of $A$ defined as the Riemann integral of the characteristic function $\chi_A$.

In Lebesgue measure, instead of a covering by finitely many rectangles, we allow a covering by countably many rectangles. A set $S$ is countable if it is finite or it is countably infinite. The latter means that there is a one-to-one correspondence between $S$ and the set $\mathbb{Z}^+$. In any case, a set $S$ is countable if and only if there is a surjection $h:\mathbb{Z}^+\to S$, which allows us to write

$$S=\{s_k\mid k\in\mathbb{Z}^+\},\quad\text{where } s_k=h(k).$$


Definition C.1 Lebesgue Measure Zero 


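Let $A$ be a subset of $\mathbb{R}^n$. We say that $A$ has Lebesgue measure zero if for every $\varepsilon>0$, there is a countable collection of open rectangles $\{U_k\mid k\in\mathbb{Z}^+\}$ that covers $A$, the sum of whose volumes is less than $\varepsilon$. Namely,

$$A\subset\bigcup_{k=1}^{\infty}U_k\quad\text{and}\quad\sum_{k=1}^{\infty}\text{vol}\,(U_k)<\varepsilon.$$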


The following is obvious. 


Proposition C.1 


Let $A$ be a subset of $\mathbb{R}^n$. If $A$ has Jordan content zero, then it has Lebesgue measure zero.


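The converse is not true. There are sets with Lebesgue measure zero that do not have Jordan content zero. The following gives an example of such a set.


Example C.1


Let $A=\mathbb{Q}\cap[0,1]$. The function $\chi_A:[0,1]\to\mathbb{R}$ is Dirichlet's function, which is not Riemann integrable. Hence, $A$ is not Jordan measurable. Nevertheless, we claim that $A$ has Lebesgue measure zero.

Recall that $\mathbb{Q}$ is a countable set. As a subset of $\mathbb{Q}$, $A$ is also countable. Hence, we can write $A$ as

$$A=\{a_k\mid k\in\mathbb{Z}^+\}.$$

Given $\varepsilon>0$ and $k\in\mathbb{Z}^+$, let $U_k$ be the open rectangle

$$U_k=\left(a_k-\frac{\varepsilon}{2^{k+2}},\,a_k+\frac{\varepsilon}{2^{k+2}}\right).$$

Then $a_k\in U_k$ for each $k\in\mathbb{Z}^+$. Thus,

$$A\subset\bigcup_{k=1}^{\infty}U_k\quad\text{and}\quad\sum_{k=1}^{\infty}\text{vol}\,(U_k)=\sum_{k=1}^{\infty}\frac{\varepsilon}{2^{k+1}}=\frac{\varepsilon}{2}<\varepsilon.$$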


Therefore, A has Lebesgue measure zero. 
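
The covering in Example C.1 can be made concrete for finitely many rationals. In the sketch below the enumeration of $\mathbb{Q}\cap[0,1]$ by increasing denominator and the radius $\varepsilon/2^{k+2}$ of the $k$-th interval are assumptions of the sketch; any enumeration and any summable choice of radii with total length below $\varepsilon$ would serve the same purpose.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` rationals of [0, 1], listed by increasing denominator."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def cover(points, eps):
    """Give the k-th point (k = 1, 2, ...) the open interval of radius eps / 2^(k+2)."""
    return [(a - eps / 2 ** (k + 2), a + eps / 2 ** (k + 2))
            for k, a in enumerate(points, start=1)]

def total_length(intervals):
    return sum(hi - lo for lo, hi in intervals)

if __name__ == "__main__":
    eps = Fraction(1, 10)
    pts = rationals_in_unit_interval(20)
    print(total_length(cover(pts, eps)), "<", eps)
```

However many points are covered, the total length of the intervals stays below $\varepsilon/2$.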


The converse to Proposition C.1 is true if A is compact. 


Proposition C.2 


Let A be a compact subset of R”. If A has Lebesgue measure zero, then it 


has Jordan content zero. 


Given $\varepsilon>0$, since $A$ has Lebesgue measure zero, there is a countable collection $\{U_\alpha\mid\alpha\in\mathbb{Z}^+\}$ of open rectangles that covers $A$, and

$$\sum_{\alpha\in\mathbb{Z}^+}\text{vol}\,(U_\alpha)<\varepsilon.$$

Since $A$ is compact, there is a finite subcollection $\{U_{\alpha_l}\mid 1\le l\le m\}$ that covers $A$. Obviously, we also have

$$\sum_{l=1}^m\text{vol}\,(U_{\alpha_l})<\varepsilon.$$

Hence, $A$ has Jordan content 0.


Example C.2 


Using the same reasoning as in Example C.1, one can show that any countable subset of $\mathbb{R}^n$ has Lebesgue measure zero.


We have seen that if $A$ is a subset of $\mathbb{R}^n$ that has Jordan content zero, then its closure $\overline{A}$ also has Jordan content zero. However, the same is not true for Lebesgue measure.


Example C.3


Example C.1 shows that the set $A=\mathbb{Q}\cap[0,1]$ has Lebesgue measure zero. Notice that $\overline{A}=[0,1]$. It cannot have Lebesgue measure zero.


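As in the case of Jordan content zero, we have the following equivalences for a set $A$ in $\mathbb{R}^n$ to have Lebesgue measure zero.

(i) For every $\varepsilon>0$, there is a countable collection of open rectangles $\{U_k\mid k\in\mathbb{Z}^+\}$ such that

$$A\subset\bigcup_{k=1}^{\infty}U_k\quad\text{and}\quad\sum_{k=1}^{\infty}\text{vol}\,(U_k)<\varepsilon.$$

(ii) For every $\varepsilon>0$, there is a countable collection of closed rectangles $\{\mathbf{I}_k\mid k\in\mathbb{Z}^+\}$ such that

$$A\subset\bigcup_{k=1}^{\infty}\mathbf{I}_k\quad\text{and}\quad\sum_{k=1}^{\infty}\text{vol}\,(\mathbf{I}_k)<\varepsilon.$$

(iii) For every $\varepsilon>0$, there is a countable collection of open cubes $\{V_k\mid k\in\mathbb{Z}^+\}$ such that

$$A\subset\bigcup_{k=1}^{\infty}V_k\quad\text{and}\quad\sum_{k=1}^{\infty}\text{vol}\,(V_k)<\varepsilon.$$

(iv) For every $\varepsilon>0$, there is a countable collection of closed cubes $\{Q_k\mid k\in\mathbb{Z}^+\}$ such that

$$A\subset\bigcup_{k=1}^{\infty}Q_k\quad\text{and}\quad\sum_{k=1}^{\infty}\text{vol}\,(Q_k)<\varepsilon.$$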
The following is obvious. 


Proposition C.3 


Let A be a subset of R”. If A has Lebesgue measure zero, and B is a subset 


of A, then B also has Lebesgue measure zero. 


Using the fact that the set $\mathbb{Z}^+\times\mathbb{Z}^+$ is countable, we find that a countable union of countable sets is countable. This gives the following.


Proposition C.4 


Let $\{A_m\mid m\in\mathbb{Z}^+\}$ be a countable collection of subsets of $\mathbb{R}^n$. If each of the $A_m$, $m\in\mathbb{Z}^+$, has Lebesgue measure zero, then the set $A=\displaystyle\bigcup_{m=1}^{\infty}A_m$ also has Lebesgue measure zero.


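Fix $\varepsilon>0$. For each $m\in\mathbb{Z}^+$, since $A_m$ has Lebesgue measure zero, there is a countable collection $\mathcal{C}_m=\{U_{m,k}\mid k\in\mathbb{Z}^+\}$ of open rectangles such that

$$A_m\subset\bigcup_{k=1}^{\infty}U_{m,k}\quad\text{and}\quad\sum_{k=1}^{\infty}\text{vol}\,(U_{m,k})<\frac{\varepsilon}{2^m}.$$

It follows that

$$A\subset\bigcup_{m=1}^{\infty}\bigcup_{k=1}^{\infty}U_{m,k}\quad\text{and}\quad\sum_{m=1}^{\infty}\sum_{k=1}^{\infty}\text{vol}\,(U_{m,k})<\sum_{m=1}^{\infty}\frac{\varepsilon}{2^m}=\varepsilon.$$

Notice that the collection

$$\mathcal{C}=\bigcup_{m=1}^{\infty}\mathcal{C}_m=\{U_{m,k}\mid m\in\mathbb{Z}^+,\,k\in\mathbb{Z}^+\}$$

is countable. This proves that $A$ has Lebesgue measure zero.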


Now we proceed to the main theorem. 


Theorem C.5 Lebesgue- Vitali Theorem 


Let $\mathbf{I}=\displaystyle\prod_{i=1}^n[a_i,b_i]$ be a closed rectangle in $\mathbb{R}^n$. Given a bounded function $f:\mathbf{I}\to\mathbb{R}$, let $\mathcal{N}$ be its set of discontinuities. Then $f:\mathbf{I}\to\mathbb{R}$ is Riemann integrable if and only if $\mathcal{N}$ has Lebesgue measure zero.


In this theorem, we only consider functions defined on closed rectangles. 


This is because the Riemann integrability of a function $f : \mathfrak{D} \to \mathbb{R}$ defined on a bounded set $\mathfrak{D}$ is defined in terms of the Riemann integrability of its zero extension to a closed rectangle $\mathbf{I}$ that contains $\mathfrak{D}$.
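To prove the Lebesgue-Vitali theorem, we need a few lemmas. Given a bounded function $f : \mathfrak{D} \to \mathbb{R}$, we define the oscillation of $f$ at a point $\mathbf{x}_0 \in \mathfrak{D}$ as
$$\omega_f(\mathbf{x}_0) = \lim_{r \to 0^+} \sup \{ f(\mathbf{u}) - f(\mathbf{v}) \mid \mathbf{u}, \mathbf{v} \in B(\mathbf{x}_0, r) \cap \mathfrak{D} \}.$$

Notice that the set
$$F_r = \{ f(\mathbf{u}) - f(\mathbf{v}) \mid \mathbf{u}, \mathbf{v} \in B(\mathbf{x}_0, r) \cap \mathfrak{D} \}$$
is a bounded subset of real numbers and $-F_r = F_r$. Thus, the supremum of $F_r$ always exists and is nonnegative. It is easy to see that
$$F_{r_1} \subset F_{r_2} \quad\text{if } r_1 < r_2.$$

Therefore, $\sup F_r$ decreases as $r \to 0^+$. This implies that
$$\omega_f(\mathbf{x}_0) = \lim_{r \to 0^+} \sup F_r = \inf_{r > 0} \, \sup_{\mathbf{u}, \mathbf{v} \in B(\mathbf{x}_0, r) \cap \mathfrak{D}} \left( f(\mathbf{u}) - f(\mathbf{v}) \right)$$
exists and is nonnegative.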
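
To get a feeling for the oscillation, one can estimate $\sup F_r$ numerically for shrinking radii. The sketch below works in one variable and uses the fact that $\sup F_r$ equals the supremum of $f$ minus the infimum of $f$ over the ball. It is a sampling estimate, not an exact computation; the helper `sup_difference`, the test functions `step` and `square`, and the domain $[-1, 1]$ are assumptions chosen for this illustration.

```python
def sup_difference(f, x0, r, lo, hi, samples=2001):
    """Estimate sup F_r over B(x0, r) ∩ [lo, hi] as (max f - min f) on sample points."""
    a, b = max(lo, x0 - r), min(hi, x0 + r)
    # Midpoint samples lie strictly inside (a, b); x0 itself is always included.
    pts = [a + (b - a) * (i + 0.5) / samples for i in range(samples)] + [x0]
    values = [f(t) for t in pts]
    return max(values) - min(values)


def step(t):
    """A function with a single jump at 0."""
    return 1.0 if t >= 0 else 0.0


def square(t):
    """A continuous function."""
    return t * t


radii = [2.0 ** -j for j in range(1, 11)]
at_jump = [sup_difference(step, 0.0, r, -1.0, 1.0) for r in radii]
away = [sup_difference(step, 0.5, r, -1.0, 1.0) for r in radii]
smooth = [sup_difference(square, 0.5, r, -1.0, 1.0) for r in radii]
```

For the step function the estimates at the jump stay at 1 for every radius, while at the point 0.5 they are 0. For the continuous function the estimates decrease towards 0 as the radius shrinks, in line with the monotonicity of $\sup F_r$.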




Lemma C.6 


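Let $\mathfrak{D}$ be a subset of $\mathbb{R}^n$, and let $\mathbf{x}_0$ be a point in $\mathfrak{D}$. Assume that $f : \mathfrak{D} \to \mathbb{R}$ is a bounded function. Then $f$ is continuous at $\mathbf{x}_0$ if and only if $\omega_f(\mathbf{x}_0) = 0$.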


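First assume that $f$ is continuous at $\mathbf{x}_0$. Given $\varepsilon > 0$, there is a $\delta > 0$ such that for all $\mathbf{x} \in B(\mathbf{x}_0, \delta) \cap \mathfrak{D}$,
$$|f(\mathbf{x}) - f(\mathbf{x}_0)| < \frac{\varepsilon}{3}.$$

It follows that for all $\mathbf{u}, \mathbf{v} \in B(\mathbf{x}_0, \delta) \cap \mathfrak{D}$,
$$|f(\mathbf{u}) - f(\mathbf{v})| < \frac{2\varepsilon}{3}.$$

Thus, if $0 < r < \delta$,
$$0 \leq \sup \{ f(\mathbf{u}) - f(\mathbf{v}) \mid \mathbf{u}, \mathbf{v} \in B(\mathbf{x}_0, r) \cap \mathfrak{D} \} \leq \frac{2\varepsilon}{3} < \varepsilon.$$

This shows that
$$\omega_f(\mathbf{x}_0) = \lim_{r \to 0^+} \sup \{ f(\mathbf{u}) - f(\mathbf{v}) \mid \mathbf{u}, \mathbf{v} \in B(\mathbf{x}_0, r) \cap \mathfrak{D} \} = 0.$$

Conversely, assume that $\omega_f(\mathbf{x}_0) = 0$. Given $\varepsilon > 0$, there is a $\delta > 0$ such that for all $0 < r < \delta$,
$$\sup \{ f(\mathbf{u}) - f(\mathbf{v}) \mid \mathbf{u}, \mathbf{v} \in B(\mathbf{x}_0, r) \cap \mathfrak{D} \} < \varepsilon.$$

If $\mathbf{x}$ is in $B(\mathbf{x}_0, \delta/2) \cap \mathfrak{D}$, then
$$|f(\mathbf{x}) - f(\mathbf{x}_0)| \leq \sup \{ f(\mathbf{u}) - f(\mathbf{v}) \mid \mathbf{u}, \mathbf{v} \in B(\mathbf{x}_0, \delta/2) \cap \mathfrak{D} \} < \varepsilon.$$

This proves that $f$ is continuous at $\mathbf{x}_0$.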




Corollary C.7 


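Let $\mathfrak{D}$ be a subset of $\mathbb{R}^n$, and let $f : \mathfrak{D} \to \mathbb{R}$ be a bounded function defined on $\mathfrak{D}$. If $\mathcal{N}$ is the set of discontinuities of $f$, then
$$\mathcal{N} = \{ \mathbf{x} \in \mathfrak{D} \mid \omega_f(\mathbf{x}) > 0 \}.$$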


We also need the following proposition. 


Proposition C.8 


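Let $\mathfrak{D}$ be a compact subset of $\mathbb{R}^n$, and let $f : \mathfrak{D} \to \mathbb{R}$ be a bounded function defined on $\mathfrak{D}$.

(a) For any $a > 0$, the set
$$A = \{ \mathbf{x} \in \mathfrak{D} \mid \omega_f(\mathbf{x}) \geq a \}$$
is a compact subset of $\mathbb{R}^n$.

(b) If $\mathcal{N}$ is the set of discontinuities of $f$, then
$$\mathcal{N} = \bigcup_{k=1}^{\infty} \mathcal{N}_k, \quad\text{where}\quad \mathcal{N}_k = \left\{ \mathbf{x} \in \mathfrak{D} \,\Big|\, \omega_f(\mathbf{x}) \geq \frac{1}{k} \right\}.$$

(c) The set $\mathcal{N}$ has Lebesgue measure zero if and only if $\mathcal{N}_k$ has Jordan content zero for each $k \in \mathbb{Z}^+$.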




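Since $\mathfrak{D}$ is compact, it is closed and bounded. For part (a), $A \subset \mathfrak{D}$ implies that $A$ is bounded. To prove that $A$ is compact, we only need to show that $A$ is closed. This is equivalent to showing that $\mathbb{R}^n \setminus A$ is open. Notice that
$$\mathbb{R}^n \setminus A = U_1 \cup U_2,$$
where
$$U_1 = \mathbb{R}^n \setminus \mathfrak{D} \quad\text{and}\quad U_2 = \mathfrak{D} \setminus A.$$

Since $\mathfrak{D}$ is closed, $U_1$ is open. If $\mathbf{x}_0 \in U_2$, then $\omega_f(\mathbf{x}_0) < a$. Let $\varepsilon = a - \omega_f(\mathbf{x}_0)$. Then $\varepsilon > 0$. By definition of $\omega_f(\mathbf{x}_0)$, there is a $\delta > 0$ such that for all $0 < r < \delta$,
$$\sup \{ f(\mathbf{u}) - f(\mathbf{v}) \mid \mathbf{u}, \mathbf{v} \in B(\mathbf{x}_0, r) \cap \mathfrak{D} \} < \omega_f(\mathbf{x}_0) + \varepsilon = a.$$

Take $c = \delta/3$. If $\mathbf{x}$ is in $B(\mathbf{x}_0, c) \cap \mathfrak{D}$, and $\mathbf{u}, \mathbf{v}$ are in $B(\mathbf{x}, c)$, then $\mathbf{u}, \mathbf{v}$ are in $B(\mathbf{x}_0, 2c)$. Since $2c < \delta$, we find that
$$\omega_f(\mathbf{x}) \leq \sup \{ f(\mathbf{u}) - f(\mathbf{v}) \mid \mathbf{u}, \mathbf{v} \in B(\mathbf{x}, c) \cap \mathfrak{D} \} \leq \sup \{ f(\mathbf{u}) - f(\mathbf{v}) \mid \mathbf{u}, \mathbf{v} \in B(\mathbf{x}_0, 2c) \cap \mathfrak{D} \} < a.$$

This shows that every point of $B(\mathbf{x}_0, c) \cap \mathfrak{D}$ lies in $U_2$, so that $B(\mathbf{x}_0, c) \subset U_1 \cup U_2 = \mathbb{R}^n \setminus A$. Since $U_1$ is open and every point of $U_2$ has a ball around it that is contained in $\mathbb{R}^n \setminus A$, the set $\mathbb{R}^n \setminus A$ is open. This completes the proof of part (a).

Part (b) follows from Corollary C.7 and the identity
$$(0, \infty) = \bigcup_{k=1}^{\infty} \left[ \frac{1}{k}, \infty \right).$$

For part (c), if $\mathcal{N}$ has Lebesgue measure zero, then for any $k \in \mathbb{Z}^+$, the subset $\mathcal{N}_k$ of $\mathcal{N}$ also has Lebesgue measure zero by Proposition C.3. By part (a) and Proposition C.2, $\mathcal{N}_k$ has Jordan content zero. Conversely, assume that $\mathcal{N}_k$ has Jordan content zero for each $k \in \mathbb{Z}^+$. Then $\mathcal{N}_k$ has Lebesgue measure zero for each $k \in \mathbb{Z}^+$. Part (b) and Proposition C.4 imply that $\mathcal{N}$ also has Lebesgue measure zero.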
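
The decomposition in part (b) can be made concrete with Thomae's function on $[0, 1]$, which takes the value $1/q$ at a rational $p/q$ in lowest terms and the value $0$ at irrational points. This function is not discussed in the text and is brought in here only as an illustration. It is a standard fact that its limit at every point is $0$, so its oscillation at $p/q$ is $1/q$ and its oscillation at an irrational point is $0$. Under this assumption, $\mathcal{N}_k$ is the finite set of rationals in $[0, 1]$ with denominator at most $k$, while $\mathcal{N} = \mathbb{Q} \cap [0, 1]$. The names `thomae_oscillation` and `level_set` in the sketch below are hypothetical.

```python
from fractions import Fraction


def thomae_oscillation(x):
    """Oscillation of Thomae's function at a rational x = p/q in lowest terms, namely 1/q."""
    return Fraction(1, x.denominator)


def level_set(k):
    """N_k: rationals in [0, 1] with oscillation at least 1/k, i.e. denominator at most k."""
    # Fraction reduces to lowest terms, and the set removes repeated values.
    return {Fraction(p, q) for q in range(1, k + 1) for p in range(q + 1)}


sizes = [len(level_set(k)) for k in range(1, 7)]
```

Each `level_set(k)` is finite, hence has Jordan content zero, and the sets increase with $k$. Their union over all $k$ is the countable set of rationals in $[0, 1]$, which has Lebesgue measure zero.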


Now we can prove the Lebesgue-Vitali theorem.


Proof of the Lebesgue-Vitali Theorem


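First we assume that $f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable. Given $k \in \mathbb{Z}^+$, we will show that the set
$$\mathcal{N}_k = \left\{ \mathbf{x} \in \mathbf{I} \,\Big|\, \omega_f(\mathbf{x}) \geq \frac{1}{k} \right\}$$
has Jordan content zero. By Proposition C.8, this implies that the set $\mathcal{N}$ of discontinuities of $f : \mathbf{I} \to \mathbb{R}$ has Lebesgue measure zero.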


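Fix $k \in \mathbb{Z}^+$. Given $\varepsilon > 0$, since $f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable, there is a partition $\mathbf{P}$ of $\mathbf{I}$ such that
$$U(f, \mathbf{P}) - L(f, \mathbf{P}) < \frac{\varepsilon}{2k}.$$

Let
$$\mathcal{A} = \{ \mathbf{J} \in \mathcal{J}_{\mathbf{P}} \mid (\operatorname{int} \mathbf{J}) \cap \mathcal{N}_k \neq \emptyset \}.$$

Then
$$\mathcal{N}_k = A_1 \cup A_2,$$
where
$$A_1 = \Big( \bigcup_{\mathbf{J} \in \mathcal{A}} \operatorname{int} \mathbf{J} \Big) \cap \mathcal{N}_k, \qquad A_2 = \Big( \bigcup_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} \partial \mathbf{J} \Big) \cap \mathcal{N}_k.$$

Notice that the set $A_2$ has Jordan content zero, since it is contained in the finite union of the boundaries $\partial \mathbf{J}$, each of which has Jordan content zero. Therefore, there are finitely many open rectangles $U_1, \ldots, U_m$ such that
$$A_2 \subset \bigcup_{l=1}^{m} U_l \quad\text{and}\quad \sum_{l=1}^{m} \operatorname{vol}\,(U_l) < \frac{\varepsilon}{2}.$$

The set $A_1$ itself is contained in a finite union of open rectangles $\operatorname{int} \mathbf{J}$ with $\mathbf{J} \in \mathcal{A}$. Notice that
$$\sum_{\mathbf{J} \in \mathcal{A}} \left( M_{\mathbf{J}}(f) - m_{\mathbf{J}}(f) \right) \operatorname{vol}\,(\mathbf{J}) \leq U(f, \mathbf{P}) - L(f, \mathbf{P}) < \frac{\varepsilon}{2k}.$$

If $\mathbf{J}$ is in $\mathcal{A}$, there is an $\mathbf{x}_0 \in \mathcal{N}_k$ such that $\mathbf{x}_0 \in \operatorname{int} \mathbf{J}$. Since $\operatorname{int} \mathbf{J}$ is an open set, there is a $\delta > 0$ such that $B(\mathbf{x}_0, \delta) \subset \operatorname{int} \mathbf{J}$. Now,
$$\sup \{ f(\mathbf{u}) - f(\mathbf{v}) \mid \mathbf{u}, \mathbf{v} \in B(\mathbf{x}_0, \delta) \} \geq \omega_f(\mathbf{x}_0) \geq \frac{1}{k}.$$

Therefore,
$$M_{\mathbf{J}}(f) - m_{\mathbf{J}}(f) \geq \frac{1}{k}$$
for each $\mathbf{J}$ in $\mathcal{A}$. This implies that
$$\frac{1}{k} \sum_{\mathbf{J} \in \mathcal{A}} \operatorname{vol}\,(\operatorname{int} \mathbf{J}) \leq \sum_{\mathbf{J} \in \mathcal{A}} \left( M_{\mathbf{J}}(f) - m_{\mathbf{J}}(f) \right) \operatorname{vol}\,(\operatorname{int} \mathbf{J}) < \frac{\varepsilon}{2k}.$$

Thus,
$$\sum_{\mathbf{J} \in \mathcal{A}} \operatorname{vol}\,(\operatorname{int} \mathbf{J}) < \frac{\varepsilon}{2}.$$

Hence,
$$\mathcal{B} = \{ \operatorname{int} \mathbf{J} \mid \mathbf{J} \in \mathcal{A} \} \cup \{ U_l \mid 1 \leq l \leq m \}$$
is a finite collection of open rectangles that covers $\mathcal{N}_k$, and the sum of the volumes of the rectangles in $\mathcal{B}$ is less than $\varepsilon$. This proves that $\mathcal{N}_k$ indeed has Jordan content zero.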


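Conversely, assume that $\mathcal{N}$ has Lebesgue measure zero. Since $f : \mathbf{I} \to \mathbb{R}$ is bounded, there is a positive number $M$ such that
$$|f(\mathbf{x})| \leq M \quad\text{for all } \mathbf{x} \in \mathbf{I}.$$

Given $\varepsilon > 0$, let $k$ be a positive integer such that
$$k \geq \frac{2 \operatorname{vol}\,(\mathbf{I})}{\varepsilon}.$$

Proposition C.8 says that $\mathcal{N}_k$ has Jordan content zero. Thus, there is a finite collection of open rectangles $\mathcal{B}_k = \{ U_l \mid 1 \leq l \leq m \}$ such that
$$\mathcal{N}_k \subset \bigcup_{l=1}^{m} U_l \quad\text{and}\quad \sum_{l=1}^{m} \operatorname{vol}\,(U_l) < \frac{\varepsilon}{4M}.$$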


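Let $\mathbf{P}_0$ be a partition of $\mathbf{I}$ such that each rectangle $\mathbf{J}$ in $\mathcal{J}_{\mathbf{P}_0}$ either lies entirely in the closure of one of the rectangles $U_l$, $1 \leq l \leq m$, or is disjoint from all the $U_l$, $1 \leq l \leq m$. Let
$$\mathcal{C} = \{ \mathbf{J} \in \mathcal{J}_{\mathbf{P}_0} \mid \mathbf{J} \cap U_l = \emptyset \text{ for all } 1 \leq l \leq m \}.$$

Since $\mathcal{N}_k$ is contained in the union of the $U_l$, none of the rectangles in $\mathcal{C}$ intersects $\mathcal{N}_k$. Hence,
$$\mathcal{N}_k \subset \bigcup_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}_0} \setminus \mathcal{C}} \mathbf{J}.$$

For each point $\mathbf{x}$ that is in $\mathbf{I} \setminus \mathcal{N}_k$, there is an $r_{\mathbf{x}} > 0$ such that the open cube $\mathbf{x} + (-r_{\mathbf{x}}, r_{\mathbf{x}})^n$ is contained in the open set $\mathbb{R}^n \setminus \mathcal{N}_k$. Since $\omega_f(\mathbf{x}) < 1/k$, by taking a smaller $r_{\mathbf{x}}$, we can assume that
$$\sup \{ f(\mathbf{u}) - f(\mathbf{v}) \mid \mathbf{u}, \mathbf{v} \in B(\mathbf{x}, 2\sqrt{n}\, r_{\mathbf{x}}) \cap \mathbf{I} \} < \frac{1}{k}.$$

The ball $B(\mathbf{x}, 2\sqrt{n}\, r_{\mathbf{x}})$ contains the cube $Q = \mathbf{x} + [-r_{\mathbf{x}}, r_{\mathbf{x}}]^n$, since every point of $Q$ is within distance $\sqrt{n}\, r_{\mathbf{x}}$ of $\mathbf{x}$. Therefore,
$$M_Q(f) - m_Q(f) < \frac{1}{k},$$
where $M_Q(f)$ and $m_Q(f)$ are the supremum and the infimum of $f$ on $Q \cap \mathbf{I}$.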
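The collection
$$\{ \mathbf{x} + (-r_{\mathbf{x}}, r_{\mathbf{x}})^n \mid \mathbf{x} \in \mathbf{I} \setminus \mathcal{N}_k \}$$
is an open covering of the compact set
$$K = \bigcup_{\mathbf{J} \in \mathcal{C}} \mathbf{J}.$$

Thus, there is a finite subcover $\{ V_1, \ldots, V_s \}$. For $1 \leq j \leq s$, let $\mathbf{I}_j = \overline{V_j} \cap \mathbf{I}$. Then we still have
$$\bigcup_{\mathbf{J} \in \mathcal{C}} \mathbf{J} \subset \bigcup_{j=1}^{s} \mathbf{I}_j.$$

After renaming the rectangles, let
$$\{ \overline{U_l} \mid 1 \leq l \leq m \} \cup \{ \mathbf{I}_j \mid 1 \leq j \leq s \} = \{ W_1, W_2, \ldots, W_q \}.$$

Now let $\mathbf{P}$ be a partition of $\mathbf{I}$ so that each rectangle in $\mathcal{J}_{\mathbf{P}}$ is either disjoint from the interiors of all the $W_j$, $1 \leq j \leq q$, or is contained in one of the $W_j$. Let
$$\mathcal{D} = \{ \mathbf{J} \in \mathcal{J}_{\mathbf{P}} \mid \mathbf{J} \subset \overline{U_l} \text{ for some } 1 \leq l \leq m \}.$$

Then
$$\sum_{\mathbf{J} \in \mathcal{D}} (M_{\mathbf{J}} - m_{\mathbf{J}}) \operatorname{vol}\,(\mathbf{J}) \leq 2M \sum_{l=1}^{m} \operatorname{vol}\,(\overline{U_l}) = 2M \sum_{l=1}^{m} \operatorname{vol}\,(U_l) < \frac{\varepsilon}{2}.$$

Each $\mathbf{J}$ that is not in $\mathcal{D}$ is contained in one of the cubes $\mathbf{x} + [-r_{\mathbf{x}}, r_{\mathbf{x}}]^n$. Therefore,
$$M_{\mathbf{J}} - m_{\mathbf{J}} \leq \frac{1}{k}.$$

It follows that
$$\sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}} \setminus \mathcal{D}} (M_{\mathbf{J}} - m_{\mathbf{J}}) \operatorname{vol}\,(\mathbf{J}) \leq \frac{1}{k} \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}} \setminus \mathcal{D}} \operatorname{vol}\,(\mathbf{J}) \leq \frac{\operatorname{vol}\,(\mathbf{I})}{k} \leq \frac{\varepsilon}{2}.$$

This proves that
$$U(f, \mathbf{P}) - L(f, \mathbf{P}) = \sum_{\mathbf{J} \in \mathcal{J}_{\mathbf{P}}} (M_{\mathbf{J}} - m_{\mathbf{J}}) \operatorname{vol}\,(\mathbf{J}) < \varepsilon.$$

Hence, $f : \mathbf{I} \to \mathbb{R}$ is Riemann integrable.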
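
A one-dimensional example shows the mechanism of the second half of the proof at work. Take the indicator function of $[1/2, 1]$ on $\mathbf{I} = [0, 1]$, whose set of discontinuities is the single point $1/2$, and uniform partitions into $n$ subintervals. Only a subinterval on which the function takes both values contributes to $U(f, \mathbf{P}) - L(f, \mathbf{P})$. The sketch below computes this difference exactly; the function name `upper_minus_lower` and the choice of example are assumptions made for illustration.

```python
from fractions import Fraction


def upper_minus_lower(n):
    """U(f, P) - L(f, P) for f = indicator of [1/2, 1] on [0, 1], P = n equal subintervals."""
    jump = Fraction(1, 2)
    gap = Fraction(0)
    for i in range(n):
        a, b = Fraction(i, n), Fraction(i + 1, n)
        sup_f = 1 if b >= jump else 0  # f takes the value 1 on [a, b] exactly when b >= 1/2
        inf_f = 1 if a >= jump else 0  # f is identically 1 on [a, b] exactly when a >= 1/2
        gap += (sup_f - inf_f) * (b - a)
    return gap


gaps = {n: upper_minus_lower(n) for n in (2, 10, 100, 1000)}
```

For each even $n$ the only contributing subinterval is $[1/2 - 1/n, 1/2]$, so the difference equals $1/n$ and can be made smaller than any given $\varepsilon$.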




